Trending Technology: Machine Learning, Artificial Intelligence, Blockchain, IoT, DevOps, Data Science

Saturday, 23 June 2018

Measures of Central Tendency and Dispersion

Summarizing Data through numbers


Measures of Central Tendency

Dispersion

Skew and Kurtosis



Measures of Central Tendency

Data set: 3,4,3,1,2,3,9,5,6,7,4,8
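Mean
    (3+4+3+1+2+3+9+5+6+7+4+8) / 12 = 55 / 12 ≈ 4.583
Median
    Sorted: 1, 2, 3, 3, 3, 4, 4, 5, 6, 7, 8, 9. With 12 values, the median is the average of the 6th and 7th values, which are both 4. Hence the median is 4.
Mode
    The value 3 appears 3 times, 4 appears 2 times, and all other values appear once. Hence 3 is the mode.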
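
As a quick check, the three values above can be reproduced with Python's standard statistics module. This is only a minimal sketch; the variable names are illustrative and not part of the original post.

```python
import statistics

data = [3, 4, 3, 1, 2, 3, 9, 5, 6, 7, 4, 8]

mean = statistics.mean(data)      # sum of the values divided by their count (55 / 12)
median = statistics.median(data)  # average of the two middle values of the sorted data
mode = statistics.mode(data)      # most frequent value

print(round(mean, 3), median, mode)  # 4.583 4.0 3
```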

When do we want to use the mean, the median and the mode?

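Choosing between mean and median
    - Bad outliers
            Errors, such as data-entry or measurement mistakes
            Do not provide a realistic picture of the story, so prefer the median, which they barely move
    - Good outliers
            The story is in the outliers, so prefer the mean, which reflects them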

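Mode
    - Useful with nominal variables, where the mean and median are not defined
    - Useful with multimodal distributions, where a single mean or median hides the separate peaks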


[ Strategy: Lose 1 rupee every day on 99% of the days, but gain Rs. 10,00,00,000 on the remaining 1% of the days. ]
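
To see why the outlier carries the story in this strategy, here is a small sketch that compares the mean and the median outcome. The 100-day window and the variable names are assumptions made for illustration; the original only gives the percentages and the amounts.

```python
import statistics

# Assumption: model 100 days, with 99 losing days and 1 winning day.
daily_results = [-1] * 99 + [100_000_000]  # Rs. 10,00,00,000 = 100,000,000

mean_result = statistics.mean(daily_results)      # pulled far upward by the single big win
median_result = statistics.median(daily_results)  # reflects the typical (losing) day

print(mean_result)    # 999999.01
print(median_result)  # -1.0
```

The median says the strategy loses a rupee on a typical day, while the mean shows that it is very profitable on average. Here the outlier is the point, so the mean is the more informative summary.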

Example:
   40% - voted for a garbage can at the 25-meter mark
   45% - voted for a garbage can at the 75-meter mark
   15% - votes spread uniformly between 0 and 100 meters
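
The original gives only the vote shares, so the following is one possible reading of the example: with two clusters of votes, the mean lands near the middle, where few people voted, while the mode picks out the most popular location. The population of 100 voters and the evenly spaced "uniform" votes are assumptions made to keep the sketch deterministic.

```python
import statistics

# Assumption: 100 voters; the 15 "uniform" votes are spread evenly over 0..100.
votes = [25] * 40 + [75] * 45 + [i * 100 / 14 for i in range(15)]

mean_vote = statistics.mean(votes)   # falls between the two clusters
modes = statistics.multimode(votes)  # most frequent location(s)

print(round(mean_vote, 2))  # close to 51.25, a spot that few voters chose
print(modes)                # [75]
```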

Measures of Dispersion 

Data set: 3,4,3,1,2,3,9,5,6,7,4,8
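Range: max - min = 9 - 1 = 8
Interquartile range: 3rd quartile - 1st quartile (75th percentile - 25th percentile) = 6.5 - 3 = 3.5
    (quartiles taken as the medians of the lower and upper halves of the sorted data)
Sample standard deviation: s = sqrt( Σ (xᵢ − x̄)² / (N − 1) ) ≈ 2.466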
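
The dispersion measures above can be checked with a short sketch using the standard statistics module. Quartile conventions differ between tools, so the code assumes the "median of each half" convention, which is the one that reproduces the 6.5 and 3 quoted above; the variable names are illustrative.

```python
import statistics

data = [3, 4, 3, 1, 2, 3, 9, 5, 6, 7, 4, 8]
ordered = sorted(data)
half = len(ordered) // 2

data_range = max(data) - min(data)

# Assumption: quartiles are the medians of the lower and upper halves.
# Other conventions (e.g. linear interpolation) give slightly different values.
q1 = statistics.median(ordered[:half])
q3 = statistics.median(ordered[half:])
iqr = q3 - q1

sample_sd = statistics.stdev(data)  # divides the squared deviations by N - 1

print(data_range, q1, q3, iqr)  # 8 3.0 6.5 3.5
print(round(sample_sd, 3))
```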

Questions that go with standard deviation
  - Why do we use the square function on the deviations? What are its implications?
  - Why do we work with the standard deviation and not the variance?
  - Why do we average by dividing by N-1 and not N?
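
The last two questions can at least be explored numerically. The sketch below does not answer them; it only shows, on the same data set, that the variance is in squared units while the standard deviation is back in the units of the data, and that dividing by N-1 gives a slightly larger value than dividing by N.

```python
import statistics

data = [3, 4, 3, 1, 2, 3, 9, 5, 6, 7, 4, 8]

sample_var = statistics.variance(data)       # sum of squared deviations / (N - 1)
population_var = statistics.pvariance(data)  # sum of squared deviations / N

sample_sd = statistics.stdev(data)           # square root of sample_var, same units as the data

print(round(sample_var, 3), round(population_var, 3))
print(round(sample_sd, 3))
```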

Mean absolute deviation and its variants
  - Use |xᵢ − x̄| instead of (xᵢ − x̄)²
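
A minimal sketch of the idea: replace the squared deviation with the absolute deviation and average. The original does not say which variants it has in mind, so the median absolute deviation shown second is an assumption, included as one commonly used variant.

```python
import statistics

data = [3, 4, 3, 1, 2, 3, 9, 5, 6, 7, 4, 8]
mean = statistics.mean(data)

# Mean absolute deviation: average distance from the mean, with no squaring.
mean_abs_dev = sum(abs(x - mean) for x in data) / len(data)

# Assumed variant: median of the absolute deviations from the median.
median = statistics.median(data)
median_abs_dev = statistics.median(abs(x - median) for x in data)

print(round(mean_abs_dev, 3), median_abs_dev)
```

Because the deviations are not squared, a single extreme value influences these measures less than it influences the standard deviation.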
 
