Trending Technology Machine Learning, Artificial Intelligent, Block Chain, IoT, DevOps, Data Science

Friday, 6 July 2018

Feature Selection in Machine Learning


Feature Reduction :-

The information about the target class is inherent in the variables (features).

Naive view :

More features
⇒ More information
⇒ Better discrimination power

In practice :
- there are many reasons why this is not the case!

Curse of Dimensionality

- The number of training examples is fixed
- The classifier's performance will usually degrade for a large number of features!



Feature Selection :-

Given a set of features F = {x_1, ..., x_n},
the Feature Selection problem is to find a subset F' ⊆ F that maximizes the learner's ability to classify patterns.
Formally, F' should maximize some scoring function.

 (x_1, x_2, ..., x_n)  →  (x_i1, x_i2, ..., x_im),  with m ≤ n

Feature Selection  Steps

Feature selection is an optimization problem
Step 1 : Search the space of possible feature subsets.
Step 2 : Pick the subset that is optimal or near-optimal with respect to some objective function.
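
The two steps can be sketched as an exhaustive (optimum) search. This is only an illustration: `best_subset`, the feature names and the toy objective `toy_score` are all made up here, and the search visits 2^n - 1 subsets, so it is practical only for a handful of features.

```python
from itertools import combinations

def best_subset(features, score):
    """Step 1: enumerate every non-empty subset. Step 2: keep the best-scoring one."""
    best, best_score = None, float("-inf")
    for k in range(1, len(features) + 1):
        for subset in combinations(features, k):
            s = score(subset)
            if s > best_score:
                best, best_score = subset, s
    return best, best_score

# Toy objective (an assumption for illustration): reward useful features
# and charge a small penalty for every feature that is kept.
usefulness = {"age": 0.5, "income": 0.4, "noise": 0.0}

def toy_score(subset):
    return sum(usefulness[f] for f in subset) - 0.1 * len(subset)

subset, value = best_subset(["age", "income", "noise"], toy_score)
print(subset, round(value, 3))
```

In a real setting the objective would be something like validation accuracy, and a heuristic or randomized search would replace the full enumeration.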




Search strategies
 - Optimum
 - Heuristic
 - Randomized

Evaluation strategies
 - Filter methods
 - Wrapper methods

Evaluating a feature subset

Supervised (Wrapper method)
 - Train using selected subset
 - Estimate error on validation dataset

Unsupervised (Filter method)
 - Look at input only
 - Select the subset that has the most information
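
A minimal sketch of the wrapper-style evaluation, assuming a 1-nearest-neighbour classifier as the learner. The function name `validation_error` and the tiny dataset are hypothetical; feature 0 is built to be informative and feature 1 to be misleading.

```python
def validation_error(subset, train, valid):
    """Train (here: memorize) on the chosen columns, then estimate error on validation data."""
    def dist(a, b):
        # Squared Euclidean distance restricted to the selected feature indices.
        return sum((a[j] - b[j]) ** 2 for j in subset)

    mistakes = 0
    for x, label in valid:
        nearest = min(train, key=lambda t: dist(t[0], x))
        if nearest[1] != label:
            mistakes += 1
    return mistakes / len(valid)

# Each example is (feature_vector, class_label).
train = [([0.0, 5.0], 0), ([1.0, 0.0], 0), ([4.0, 5.1], 1), ([5.0, 0.1], 1)]
valid = [([0.5, 0.2], 0), ([4.5, 4.9], 1)]

for subset in ([0], [1], [0, 1]):
    print(subset, validation_error(subset, train, valid))
```

Any learner could be substituted for the nearest-neighbour rule; the point is only that the subset is scored by the error of a model trained on it.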



Forward Selection
- Start with empty feature set
- Try each remaining feature
- Estimate the classification/regression error after adding each feature
- Select the feature that gives the maximum improvement
- Stop when there is no significant improvement
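
The forward selection steps above can be sketched as follows. The `error` callback stands for any estimate of validation error, and `min_gain` is an assumed threshold for "significant improvement"; both names, and the toy error function, are illustrative only.

```python
def forward_selection(features, error, min_gain=1e-3):
    """Greedy forward selection driven by a caller-supplied error estimate."""
    selected = []                      # start with an empty feature set
    remaining = list(features)
    current = error(selected)
    while remaining:
        # Try each remaining feature and find the one giving the lowest error.
        candidate = min(remaining, key=lambda f: error(selected + [f]))
        new_error = error(selected + [candidate])
        if current - new_error < min_gain:
            break                      # no significant improvement: stop
        selected.append(candidate)
        remaining.remove(candidate)
        current = new_error
    return selected, current

# Toy error (an assumption): each feature lowers a base error of 0.5 by a fixed amount.
gain = {"a": 0.2, "b": 0.1, "c": 0.0}

def toy_error(subset):
    return 0.5 - sum(gain[f] for f in subset)

selected, final_error = forward_selection(["a", "b", "c"], toy_error)
print(selected, round(final_error, 3))
```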

Backward Search
- Start with the full feature set
- Try removing each feature
- Drop the feature with the smallest impact on error
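
A matching sketch for backward search. The text gives no stopping rule, so this version assumes one: stop as soon as dropping even the least useful feature would raise the error by more than `max_loss`. The names and the toy error function are hypothetical.

```python
def backward_elimination(features, error, max_loss=1e-3):
    """Greedy backward search driven by a caller-supplied error estimate."""
    selected = list(features)          # start with the full feature set
    current = error(selected)
    while len(selected) > 1:
        # Try removing each feature and find the removal that hurts least.
        def error_without(f):
            return error([g for g in selected if g != f])
        candidate = min(selected, key=error_without)
        new_error = error_without(candidate)
        if new_error - current > max_loss:
            break                      # every removal hurts too much: stop
        selected.remove(candidate)
        current = new_error
    return selected, current

# Toy error (an assumption): each feature lowers a base error of 0.5 by a fixed amount.
gain = {"a": 0.2, "b": 0.1, "c": 0.0}

def toy_error(subset):
    return 0.5 - sum(gain[f] for f in subset)

kept, final_error = backward_elimination(["a", "b", "c"], toy_error)
print(kept, round(final_error, 3))
```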


Univariate (looks at each feature independently of the others)
- Pearson correlation coefficient
- F-score
- Chi-square
- Signal to noise ratio
- Mutual information
- Etc.

Rank features by importance
Ranking cut-off is determined by user
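
As a rough illustration, ranking with a user-chosen cut-off might look like this. The scores are made-up numbers, `top_k_features` is a hypothetical helper, and ranking by absolute value is an assumption that suits signed scores such as correlations.

```python
def top_k_features(scores, k):
    """Rank features by the magnitude of their univariate score and keep the top k."""
    ranked = sorted(scores, key=lambda f: abs(scores[f]), reverse=True)
    return ranked[:k]

# Made-up univariate scores, e.g. correlation of each feature with the target.
scores = {"height": 0.82, "weight": -0.91, "shoe_size": 0.15}

print(top_k_features(scores, 2))
```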


Pearson correlation coefficient

- Measures the linear correlation between two variables
- Formula for Pearson correlation: r = Σ(x_i − x̄)(y_i − ȳ) / sqrt( Σ(x_i − x̄)² · Σ(y_i − ȳ)² )
- The correlation r is between +1 and -1.
  •   +1 means perfect positive correlation
  •   -1 means perfect negative correlation
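
A small sketch of the standard Pearson formula, written with the standard library only. The function name `pearson_r` and the sample lists are illustrative; the function assumes neither input is constant, since the denominator would otherwise be zero.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))   # y rises with x
print(pearson_r([1, 2, 3], [3, 2, 1]))         # y falls as x rises
```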


Signal to noise ratio

- Difference in the class means divided by the sum of the standard deviations of the two classes
                    S2N(X,Y) = (μx - μy) / (σx + σy)
- Large absolute values indicate a strong correlation between the feature and the class
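
A minimal sketch of the signal-to-noise score for one feature. It uses the commonly cited form with the sum of the two class standard deviations in the denominator, and sample standard deviations from the `statistics` module; the function name and the sample values are assumptions.

```python
from statistics import mean, stdev

def signal_to_noise(values_x, values_y):
    """S2N for one feature: values_x and values_y hold the feature's values in each class."""
    return (mean(values_x) - mean(values_y)) / (stdev(values_x) + stdev(values_y))

# Made-up feature values for the two classes.
class_x = [5, 6, 7]
class_y = [1, 2, 3]

print(signal_to_noise(class_x, class_y))
```

Each class needs at least two values for `stdev` to be defined.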

Multivariate feature selection

- Multivariate methods consider all features simultaneously.
- Consider the weight vector w of any linear classifier.
- The classification of a point x is given by the sign of w^T x + w_0.
- Entries of w with small magnitude have little effect on the dot product, so the corresponding features are less relevant (assuming the features are on comparable scales).
- For example, if w = (10, 0.1, -9), then features 0 and 2 contribute more to the dot product than feature 1.
          - The ranking of features given by this w is 0, 2, 1.
- The vector w can be obtained from any linear classifier.
- A variant of this approach is called recursive feature elimination:
     1. Compute w on all features
     2. Remove the feature with the smallest |w_i|
     3. Recompute w on the reduced data
     4. If the stopping criterion is not met, go to step 2
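
The weight-based ranking and the elimination loop can be sketched as below. To stay dependency-free, the linear classifier is a simple nearest-centroid rule whose weight vector is the difference of the class means; that choice, the helper names, the toy data and the "keep `n_keep` features" stopping criterion are all assumptions made for illustration.

```python
def rank_by_weight(w):
    """Feature indices ordered from largest to smallest |w_i|."""
    return sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)

def centroid_weights(X, y):
    """Stand-in linear classifier: w = mean of class 1 minus mean of class 0."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    return [sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
            for j in range(len(X[0]))]

def rfe(X, y, fit, n_keep):
    """Recursive feature elimination; returns the original indices that survive."""
    kept = list(range(len(X[0])))
    while len(kept) > n_keep:                      # stopping criterion (assumed)
        reduced = [[row[j] for j in kept] for row in X]
        w = fit(reduced, y)                        # (re)compute w on current features
        weakest = min(range(len(w)), key=lambda i: abs(w[i]))
        del kept[weakest]                          # remove feature with smallest |w_i|
    return kept

print(rank_by_weight([10, 0.1, -9]))               # the example from the text

# Toy data: feature 0 separates the classes, features 1 and 2 barely differ.
X = [[5.0, 1.0, 0.2], [6.0, 1.1, 0.9], [1.0, 1.0, 0.1], [2.0, 0.9, 1.0]]
y = [1, 1, 0, 0]
print(rfe(X, y, centroid_weights, n_keep=1))
```

Any routine that returns one weight per feature, such as a logistic regression or linear SVM fit, could be passed as `fit` instead.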
