Thursday, August 28, 2014

In statistics, what would happen to the variance and standard deviation if the highest and lowest values were taken out? Explain why. I think they'd...

In statistics, what would happen to the variance and the
standard deviation if the highest and lowest values were taken
out?


In general they will both decrease. One clear exception is the case where all data entries are equal: the variance and standard deviation are zero to begin with and remain zero after the two data points are removed.


Consider the formula for the variance; in this case I would look at the "shortcut" formula:


`v=(n(sum x^2)-(sum x)^2)/(n(n-1))`
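As a quick check, the shortcut formula can be written out directly and compared with a library routine. This is a minimal sketch: the function name `shortcut_variance` and the sample data are made up for illustration and are not part of the original answer.

```python
import statistics


def shortcut_variance(data):
    """Sample variance via (n*sum(x^2) - (sum x)^2) / (n*(n-1))."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points")
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))


# Hypothetical data set, chosen only for illustration.
sample = [2, 4, 4, 5, 7, 9, 12]

# Both lines should print the same sample variance (up to rounding).
print(shortcut_variance(sample))
print(statistics.variance(sample))
```

The shortcut form is convenient for hand calculation; for real data with large values, the library routine is usually the safer choice numerically.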


After throwing out the two data points, the denominator will decrease by 4n-6. [Take n(n-1) - (n-2)(n-3)]. A reduced denominator on its own increases the value of a fraction, but the numerator is decreasing also. As long as the numerator shrinks by a larger proportion than the denominator does, the variance will decrease.
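To see the two effects side by side, the numerator and denominator of the shortcut formula can be computed before and after trimming. This is a sketch on a made-up data set; `variance_parts` and `trim_extremes` are hypothetical helper names, and the trimming removes exactly one largest and one smallest value.

```python
def variance_parts(data):
    """Return (numerator, denominator) of the shortcut variance formula."""
    n = len(data)
    numerator = n * sum(x * x for x in data) - sum(data) ** 2
    denominator = n * (n - 1)
    return numerator, denominator


def trim_extremes(data):
    """Drop one smallest and one largest value."""
    return sorted(data)[1:-1]


# Hypothetical data set, chosen only for illustration.
sample = [2, 4, 4, 5, 7, 9, 12]
trimmed = trim_extremes(sample)
n = len(sample)

num_full, den_full = variance_parts(sample)
num_trim, den_trim = variance_parts(trimmed)

# The denominator drops by exactly 4n - 6.
print(den_full - den_trim, 4 * n - 6)

# Ratios of new to old: the numerator shrinks proportionally more
# than the denominator here, so the variance goes down.
print(num_trim / num_full, den_trim / den_full)
print(num_full / den_full, num_trim / den_trim)
```

For this particular data set the numerator falls from 496 to 94 while the denominator falls from 42 to 20, so the variance drops from about 11.81 to 4.7.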


The decrease is not universal, though. Consider the data 1,1,1,1,1,9,9,9,9,9. If you calculate the standard deviation for this data you get `s(x) ~~ 4.216` (a variance of about 17.78),


while the reduced set (eliminating one highest and one lowest value) yields `s(x) ~~ 4.276` (a variance of about 18.29).
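The counterexample is easy to reproduce. The sketch below uses Python's standard `statistics` module and assumes "eliminating highest and lowest values" means removing a single 9 and a single 1; the variable names are illustrative.

```python
import statistics

data = [1] * 5 + [9] * 5          # 1,1,1,1,1,9,9,9,9,9
trimmed = sorted(data)[1:-1]      # drop one 1 and one 9

sd_full = statistics.stdev(data)
sd_trimmed = statistics.stdev(trimmed)

# Sample standard deviations before and after trimming.
print(round(sd_full, 3))      # 4.216
print(round(sd_trimmed, 3))   # 4.276

# Trimming increased the spread for this data set.
print(sd_trimmed > sd_full)   # True
```

The remaining eight values are still all at distance 4 from the mean, so the sum of squared deviations falls only from 160 to 128 while the divisor n-1 falls from 9 to 7.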


The standard deviation is the square root of the variance, and since the square root function is increasing, if the variance shrinks so does the standard deviation.


So the answer is that it depends on the data set. Some unusual sets may show an increase, but generally the variance (and hence the standard deviation) will decrease.
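One rough way to get a feel for "generally" is a small simulation. The sketch below assumes samples of size 20 drawn from a standard normal distribution, which is purely an illustrative choice and not something the answer above specifies; `fraction_decreasing` is a hypothetical helper name.

```python
import random
import statistics


def fraction_decreasing(trials=2000, n=20, seed=0):
    """Fraction of random samples whose variance drops after trimming.

    Each sample has n values from a standard normal distribution
    (an assumption made only for this illustration). Trimming removes
    one smallest and one largest value.
    """
    rng = random.Random(seed)
    decreases = 0
    for _ in range(trials):
        data = [rng.gauss(0, 1) for _ in range(n)]
        trimmed = sorted(data)[1:-1]
        if statistics.variance(trimmed) < statistics.variance(data):
            decreases += 1
    return decreases / trials


print(fraction_decreasing())
```

For bell-shaped data like this the fraction should come out close to 1, while sets with values clustered at two extremes, such as 1,1,1,1,1,9,9,9,9,9, are the kind that can go the other way.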
