**Feature Extraction: Definition**

Given a set of features $F = \{x_1, \ldots, x_N\}$, the **Feature Extraction** ("Construction") problem is to map $F$ to some feature set $F''$ that maximizes the learner's ability to classify patterns.

Find a projection matrix $W$ that maps N-dimensional vectors to M-dimensional vectors while keeping the error low.

Assume that the N features are linear combinations of M < N basis vectors:

$$z_i = w_{i1} x_1 + \cdots + w_{iN} x_N$$

$$z = W^{T} x$$
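The projection above can be sketched in a few lines of NumPy. This is only an illustration: the helper name `project`, the example matrix `W`, and the sample data are made up here, and the samples are assumed to be stored as rows of `X`, so the whole data set is projected with `X @ W`.

```python
import numpy as np

def project(X, W):
    """Project the rows of X (n_samples x N) onto the columns of W (N x M).

    Each output row is z = W^T x for the corresponding input row x.
    """
    X = np.asarray(X, dtype=float)
    W = np.asarray(W, dtype=float)
    if X.shape[1] != W.shape[0]:
        raise ValueError("W must have one row per original feature")
    return X @ W

# Hypothetical example: N = 3 original features, M = 2 extracted features.
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
Z = project(X, W)
print(Z)  # one M-dimensional row per input sample
```

Each column of `W` holds the weights of one extracted feature, so column i gives the coefficients of z_i.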

What we expect from such a basis:

- Uncorrelated, so that the features cannot be reduced further

- Large variance, since features with little variance bear little or no information
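Both expectations can be checked numerically. The sketch below assumes the usual PCA construction (eigenvectors of the sample covariance matrix, sorted by eigenvalue); the function name `pca_basis` and the synthetic data are illustrative, not taken from the notes.

```python
import numpy as np

def pca_basis(X, M):
    """Return the top-M principal directions (N x M) and their variances."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # center the data
    cov = np.cov(Xc, rowvar=False)           # N x N sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:M]    # keep the M largest
    return eigvecs[:, order], eigvals[order]

# Synthetic correlated data (an assumption made purely for illustration).
rng = np.random.default_rng(0)
A = np.array([[2.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.5, 0.2, 0.1]])
X = rng.normal(size=(500, 3)) @ A

W, variances = pca_basis(X, M=2)
Z = (X - X.mean(axis=0)) @ W                 # projected features
cov_Z = np.cov(Z, rowvar=False)

print(cov_Z)      # off-diagonal entries are numerically zero
print(variances)  # variance captured by each direction, largest first
```

The covariance of the projected data is diagonal, which is the "uncorrelated" property, and its diagonal holds the variances in decreasing order.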

**Algebraic definition of PCs**

PCA

**PCA for Image Compression**

**Is PCA a good criterion for classification?**

- Data variation determines the projection direction

- What's missing?

- Class information

**What is a good projection?**

- Similarly, what is a good criterion?

- Separating different classes

**What class information may be useful?**

Between-class distance

- Distance between the centroids of different classes

Within-class distance

- Accumulated distance of an instance to the centroid of its class.
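A minimal sketch of the two quantities, assuming plain Euclidean distance and summing over all pairs of class centroids. The function name `class_distances` and the toy data are hypothetical; LDA itself works with squared distances (scatter) rather than the raw distances used here.

```python
import numpy as np

def class_distances(X, y):
    """Return (between-class distance, within-class distance) for labeled data."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    labels = np.unique(y)
    centroids = {c: X[y == c].mean(axis=0) for c in labels}

    # Between-class: sum of distances between every pair of class centroids.
    between = sum(
        np.linalg.norm(centroids[a] - centroids[b])
        for i, a in enumerate(labels)
        for b in labels[i + 1:]
    )
    # Within-class: accumulated distance of each instance to its own centroid.
    within = sum(
        np.linalg.norm(X[y == c] - centroids[c], axis=1).sum()
        for c in labels
    )
    return between, within

# Toy data: two well-separated classes on a line.
X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [12.0, 0.0]])
y = np.array([0, 0, 1, 1])
between, within = class_distances(X, y)
print(between, within)
```

A large between-class distance combined with a small within-class distance indicates classes that are easy to separate.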

Linear discriminant analysis (LDA) finds the most discriminant projection by

- maximizing between-class distance

- and minimizing within-class distance

**Linear Discriminant Analysis**

Find a low-dimensional space such that when 𝓍 is projected, classes are well-separated.

**Means and Scatter after projection**

**Good Projection**

- Means are as far away as possible

- Scatter is as small as possible

- Fisher Linear Discriminant

$$J(w) = \frac{(m_1 - m_2)^2}{s_1^2 + s_2^2}$$
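The Fisher criterion J(w) can be evaluated directly for two classes. The sketch below makes two assumptions that the notes do not spell out: the scatter of a class is the sum of squared deviations of its projected samples from their mean, and the maximizing direction is the standard closed form proportional to S_W^{-1}(mu_1 - mu_2), where S_W is the within-class scatter matrix. Function names and data are illustrative.

```python
import numpy as np

def fisher_criterion(X1, X2, w):
    """J(w) = (m1 - m2)^2 / (s1^2 + s2^2) for samples projected onto w."""
    w = np.asarray(w, dtype=float)
    z1 = np.asarray(X1, dtype=float) @ w
    z2 = np.asarray(X2, dtype=float) @ w
    m1, m2 = z1.mean(), z2.mean()            # projected means
    s1_sq = ((z1 - m1) ** 2).sum()           # projected scatter, class 1
    s2_sq = ((z2 - m2) ** 2).sum()           # projected scatter, class 2
    return (m1 - m2) ** 2 / (s1_sq + s2_sq)

def fisher_direction(X1, X2):
    """Unit vector proportional to S_W^{-1} (mu1 - mu2)."""
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    S_W = (X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)
    w = np.linalg.solve(S_W, mu1 - mu2)      # assumes S_W is invertible
    return w / np.linalg.norm(w)

# Toy two-class data in 2-D.
X1 = np.array([[1, 2], [2, 3], [3, 3], [4, 5], [5, 5]], dtype=float)
X2 = np.array([[1, 0], [2, 1], [3, 1], [3, 2], [5, 3], [6, 5]], dtype=float)

w_star = fisher_direction(X1, X2)
j_star = fisher_criterion(X1, X2, w_star)
j_x = fisher_criterion(X1, X2, [1.0, 0.0])   # project onto the first axis
j_y = fisher_criterion(X1, X2, [0.0, 1.0])   # project onto the second axis
print(j_star, j_x, j_y)
```

J(w) does not depend on the length of w, so only the direction matters; the Fisher direction scores at least as high as either coordinate axis on this data.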
