Trending Technology Machine Learning, Artificial Intelligent, Block Chain, IoT, DevOps, Data Science

Saturday, 23 June 2018

Measures of Central Tendency and Dispersion

Summarizing Data through numbers

Measures of Central Tendency

Measures of Dispersion

Skew and Kurtosis

Measures of Central Tendency

Data set: 3,4,3,1,2,3,9,5,6,7,4,8
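Mean: (3+4+3+1+2+3+9+5+6+7+4+8) / 12 = 55/12 = 4.583
Median: sort the values: 1,2,3,3,3,4,4,5,6,7,8,9. With 12 values, the median is the average of the 6th and 7th values, (4+4)/2. Hence Answer = 4
Mode: the value 3 appears 3 times, 4 appears 2 times, and all other values appear once. Hence 3 is the mode.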
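The three results above can be checked with a short sketch using Python's standard `statistics` module. The variable names are only illustrative.

```python
import statistics

data = [3, 4, 3, 1, 2, 3, 9, 5, 6, 7, 4, 8]

mean = statistics.mean(data)      # sum of the values divided by their count
median = statistics.median(data)  # average of the two middle values of the sorted data
mode = statistics.mode(data)      # most frequent value

print(f"mean   = {mean:.3f}")
print(f"median = {median}")
print(f"mode   = {mode}")
```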

Where do we want to use Mean, Median and Mode

Choosing between mean and median
    - Bad outliers
            Do not provide a realistic picture of the story, so the median is the safer choice
    - Good outliers
            The story is in the outliers, so the mean is the measure that captures it

When to use the mode
    - Useful with nominal variables
    - Multi modal distributions

[ Strategy: Lose 1 rupee every day on 99% of the days, but on the other 1% of the days gain Rs. 10,00,00,000. ]
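A minimal sketch of why the outlier is the story in this strategy. It assumes 100 representative days (99 losing days and 1 winning day), which is an illustrative simplification of the 99% / 1% split.

```python
import statistics

# Assumption: 100 representative days, 99 losing 1 rupee and 1 gaining Rs. 10,00,00,000.
# Python allows underscores in numeric literals, so the Indian digit grouping is kept.
daily_returns = [-1] * 99 + [10_00_00_000]

mean_return = statistics.mean(daily_returns)      # pulled far up by the single winning day
median_return = statistics.median(daily_returns)  # sees only the typical (losing) day

print(f"mean return per day   = {mean_return:,.2f}")
print(f"median return per day = {median_return}")
```

The median says the strategy loses money on a typical day, while the mean shows it is very profitable on average. Here the mean is the more useful summary.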

Example (multi modal distribution):
   40% - voted for garbage can at 25th meter mark
   45% - voted for garbage can at 75th meter mark
   15% - voted uniformly between the 0 and 100 meter marks
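The garbage can example can be sketched with a hypothetical population of 100 voters. The 15 evenly spaced positions standing in for the uniform voters are an assumption made only to keep the example deterministic.

```python
import statistics

# Hypothetical population of 100 voters matching the percentages above.
votes = (
    [25.0] * 40
    + [75.0] * 45
    # Assumption: 15 evenly spaced positions stand in for the uniform voters.
    + [100 * (i + 0.5) / 15 for i in range(15)]
)

mean_position = statistics.mean(votes)
median_position = statistics.median(votes)
mode_position = statistics.mode(votes)

print(f"mean   = {mean_position:.2f}")
print(f"median = {median_position:.2f}")
print(f"mode   = {mode_position}")
```

Under these assumptions the mean and the median both fall between the two popular spots, where almost nobody voted. The mode picks the 75th meter mark, the single most requested location.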

Measures of Dispersion 

Data set: 3,4,3,1,2,3,9,5,6,7,4,8
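Range: Max - Min (9 - 1 = 8)
Inter Quartile Range: 3rd quartile - 1st quartile (75th Percentile - 25th Percentile) (6.5 - 3 = 3.5), taking the quartiles as the medians of the lower and upper halves of the sorted data
Sample Standard deviation: s = sqrt( sum of (x_i - x̄)^2 / (N-1) ), approximately 2.466 for this data set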
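A short sketch of the three dispersion measures for the same data set. The quartiles are computed as the medians of the lower and upper halves of the sorted data, which reproduces the 3 and 6.5 used above. Library routines that interpolate percentiles may return slightly different quartiles.

```python
import statistics

data = [3, 4, 3, 1, 2, 3, 9, 5, 6, 7, 4, 8]
ordered = sorted(data)

value_range = max(data) - min(data)

# Quartiles as medians of the lower and upper halves of the sorted data.
half = len(ordered) // 2
q1 = statistics.median(ordered[:half])
q3 = statistics.median(ordered[-half:])
iqr = q3 - q1

# statistics.stdev divides the sum of squared deviations by N-1 (sample standard deviation).
sample_sd = statistics.stdev(data)

print(f"range = {value_range}")
print(f"IQR   = {q3} - {q1} = {iqr}")
print(f"sample standard deviation = {sample_sd:.3f}")
```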

Questions that go with Standard deviation
  - Why do we use the square function on the deviations? What are its implications?
  - Why do we work with the standard deviation and not the variance?
  - Why do we average by dividing by N-1 and not N?
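These questions are easier to explore with the numbers in hand. The sketch below is only an illustration: it computes the squared deviations by hand, then compares dividing by N with dividing by N-1, and the variance with its square root.

```python
import math

data = [3, 4, 3, 1, 2, 3, 9, 5, 6, 7, 4, 8]
n = len(data)
mean = sum(data) / n

# Squaring makes every deviation non-negative and weights large deviations more heavily.
squared_devs = [(x - mean) ** 2 for x in data]

variance_n = sum(squared_devs) / n                # divide by N
variance_n_minus_1 = sum(squared_devs) / (n - 1)  # divide by N-1 (sample variance)

# The variance is in squared units; its square root is back in the units of the data.
sd_n_minus_1 = math.sqrt(variance_n_minus_1)

print(f"variance (N)   = {variance_n:.3f}")
print(f"variance (N-1) = {variance_n_minus_1:.3f}")
print(f"standard deviation (N-1) = {sd_n_minus_1:.3f}")
```

Dividing by N-1 always gives a slightly larger value than dividing by N, and the gap shrinks as N grows.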

Mean Absolute Deviation and its variants
  - Use |x_i - x̄| instead of (x_i - x̄)^2
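A minimal sketch of the mean absolute deviation on the same data set. The median absolute deviation is included as one commonly used variant. Treating it as one of the variants meant here is an assumption.

```python
import statistics

data = [3, 4, 3, 1, 2, 3, 9, 5, 6, 7, 4, 8]

# Mean absolute deviation: average of |x_i - mean|.
mean = statistics.mean(data)
mean_abs_dev = sum(abs(x - mean) for x in data) / len(data)

# Variant (assumption): median absolute deviation, the median of |x_i - median|.
median = statistics.median(data)
median_abs_dev = statistics.median(abs(x - median) for x in data)

print(f"mean absolute deviation   = {mean_abs_dev:.3f}")
print(f"median absolute deviation = {median_abs_dev}")
```

Because the deviations are not squared, a single extreme value affects these measures less than it affects the standard deviation.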
