Section 1 : Dispersion part 3
 

Variance - var(x)

Variance is a measure of dispersion (the spread of data).

The variance of a random variable x is given by:

\[ \mathrm{var}(x) = \frac{\sum (x_r - \mu)^2}{n} \]

Where,

x_r is any value of the random variable x
μ is the mean value of x
n is the number of values of x
Σ means the sum of the bracketed term over all values of x

The variance can also be expressed in terms of the standard deviation σ:

\[ \mathrm{var}(x) = \sigma^2 \]

 


Standard deviation - symbol σ (sigma)

Like variance, standard deviation is also a measure of dispersion.

From the equations above, it follows that:

\[ \sigma = \sqrt{\frac{\sum (x_r - \mu)^2}{n}} \]

and,

\[ \sigma = \sqrt{\mathrm{var}(x)} \]

Example

Calculate the variance and standard deviation of the following data set of 10 numbers (n = 10). Give your answers to 3 significant figures.

1, 1, 3, 4, 4, 5, 7, 7, 9, 10

The mean is

\[ \mu = \frac{\sum x_r}{n} = \frac{51}{10} = 5.1 \]

The differences between each value and the mean are

Σ(x_r - μ) = (1 - 5.1) + (1 - 5.1) + (3 - 5.1) + (4 - 5.1) + (4 - 5.1)
           + (5 - 5.1) + (7 - 5.1) + (7 - 5.1) + (9 - 5.1) + (10 - 5.1)
           = (-4.1) + (-4.1) + (-2.1) + (-1.1) + (-1.1) + (-0.1) + (1.9) + (1.9) + (3.9) + (4.9)

Squaring each difference and summing gives

Σ(x_r - μ)² = 16.81 + 16.81 + 4.41 + 1.21 + 1.21 + 0.01
            + 3.61 + 3.61 + 15.21 + 24.01
            = 86.9

so that

\[ \sigma^2 = \frac{\sum (x_r - \mu)^2}{n} = \frac{86.9}{10} = 8.69 \]

The variance (σ²) is 8.69

The standard deviation (σ) is √8.69, that is 2.94788 or 2.95 (3 s.f.)
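
The arithmetic in this example can be checked with a short script. The following is a minimal sketch in Python, assuming the population form of the variance (dividing by n rather than n - 1), which is what the worked example uses. The variable names are illustrative only.

```python
import math
import statistics

# Data set from the worked example
data = [1, 1, 3, 4, 4, 5, 7, 7, 9, 10]

n = len(data)
mean = sum(data) / n

# Population variance: mean of the squared differences from the mean
variance = sum((x - mean) ** 2 for x in data) / n
std_dev = math.sqrt(variance)

print(f"mean = {mean:.1f}")                   # mean = 5.1
print(f"variance = {variance:.2f}")           # variance = 8.69
print(f"standard deviation = {std_dev:.3g}")  # standard deviation = 2.95

# Cross-check against the standard library's population variance
print(math.isclose(variance, statistics.pvariance(data)))  # True
```

Note that `statistics.variance` (without the leading `p`) divides by n - 1 and would give a different answer for the same data.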

 


 

Variance and Standard Deviation for grouped data

Recalling that for grouped data,

\[ \mu = \frac{\sum f x}{\sum f} \]

The definition of variance (σ²) for grouped data is:

\[ \sigma^2 = \frac{\sum f x^2}{\sum f} - \mu^2 \]

Where x is the mid-value for each group of data and f is the frequency of that group.
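
As a brief check, the following sketch shows why the frequency-weighted mean of the squared differences from the mean can be written as Σfx²/Σf minus μ², the form that is convenient for table calculations. The algebra assumes only that μ = Σfx/Σf.

\[
\begin{aligned}
\frac{\sum f (x - \mu)^2}{\sum f}
  &= \frac{\sum f x^2 - 2\mu \sum f x + \mu^2 \sum f}{\sum f} \\
  &= \frac{\sum f x^2}{\sum f} - 2\mu^2 + \mu^2 \\
  &= \frac{\sum f x^2}{\sum f} - \mu^2
\end{aligned}
\]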

Example

The grouped data in the table represent the exam marks (m) and their frequency (f) for 100 students. Estimate the variance and standard deviation to 2 decimal places.

m    0≤m<20    20≤m<40    40≤m<60    60≤m<80    80≤m<100
f    3         19         51         22         5

 

m           mid-interval value x    f      fx      x²      fx²
0≤m<20      10                      3      30      100     300
20≤m<40     30                      19     570     900     17100
40≤m<60     50                      51     2550    2500    127500
60≤m<80     70                      22     1540    4900    107800
80≤m<100    90                      5      450     8100    40500
sum Σ                               100    5140            293200

The estimated mean is given by:

\[ \mu = \frac{\sum f x}{\sum f} = \frac{5140}{100} = 51.4 \]

The variance is given by:

\[ \sigma^2 = \frac{\sum f x^2}{\sum f} - \mu^2 \]

\[ \sigma^2 = \frac{293200}{100} - 51.4^2 \]

\[ \sigma^2 = 2932 - 2641.96 = 290.04 \]

The variance is 290.04

The standard deviation is 17.03 (√290.04)
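
The table calculation can be reproduced in a few lines. This is a minimal sketch in Python, assuming each group is represented by its mid-interval value and that the variance is taken as Σfx²/Σf minus the square of the mean. The tuple layout and variable names are illustrative choices.

```python
import math

# (lower bound, upper bound, frequency) for each group of exam marks
groups = [(0, 20, 3), (20, 40, 19), (40, 60, 51), (60, 80, 22), (80, 100, 5)]

total_f = sum(f for _, _, f in groups)

# Mid-interval value x = (lower + upper) / 2 stands in for every mark in the group
sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in groups)
sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in groups)

mean = sum_fx / total_f
variance = sum_fx2 / total_f - mean ** 2
std_dev = math.sqrt(variance)

print(f"mean = {mean:.1f}")                   # mean = 51.4
print(f"variance = {variance:.2f}")           # variance = 290.04
print(f"standard deviation = {std_dev:.2f}")  # standard deviation = 17.03
```

Because the mid-interval values replace the actual marks, the results are estimates rather than exact values.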

 


 

Skewness

Skewness is a measure of the asymmetry of a distribution, that is, the degree to which it is distorted from the symmetrical shape of a normal distribution.

A normal (or Gaussian) distribution is a symmetrical curve, with a central maximum.
The mean, mode and median occur at one point along the x-axis, corresponding to the central maximum.

[Figure: standard normal distribution]

Where SD stands for standard deviation σ (sigma):

68.3% of values are within 1 SD of the mean
95.4% of values are within 2 SD of the mean
99.7% of values are within 3 SD of the mean
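
The percentages for 1, 2 and 3 SD can be computed directly. The sketch below, in Python, uses the standard result that the fraction of a normal distribution within k standard deviations of the mean is erf(k/√2). The helper name `fraction_within` is made up for this illustration.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k SD of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {100 * fraction_within(k):.1f}%")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```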

When a distribution is skewed the curve is no longer symmetrical. The central maximum is moved either to the right or the left.

[Figure: positive and negative skewness]

A positive skew is when the right tail is longer. The central maximum is to the left of the figure and the mean is greater than the mode.

A negative skew is when the left tail is longer. The central maximum is to the right of the figure and the mean is less than the mode.

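Skewness can be simply measured using either:

The Pearson Mode Coefficient of skewness

\[ \text{skewness} = \frac{\text{mean} - \text{mode}}{\sigma} \]

The Pearson Median Coefficient of skewness

\[ \text{skewness} = \frac{3(\text{mean} - \text{median})}{\sigma} \]

The two coefficients give approximately the same value for moderately skewed distributions with a single peak, because for such distributions mean - mode ≈ 3(mean - median).

Another method of measuring skewness uses the quartiles (Q1, Q2, Q3):

\[ \text{skewness} = \frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{Q_3 - Q_1} \]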
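
To make the three measures concrete, here is a minimal sketch in Python. It assumes the usual textbook forms: (mean - mode)/σ for the mode coefficient, 3(mean - median)/σ for the median coefficient, and ((Q3 - Q2) - (Q2 - Q1))/(Q3 - Q1) for the quartile measure. The sample data are hypothetical and chosen to have a longer right tail.

```python
import statistics

# Hypothetical sample with a longer right tail (not taken from the text)
data = [2, 3, 3, 3, 4, 5, 6, 8, 11]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
sigma = statistics.pstdev(data)  # population SD (divides by n)

pearson_mode = (mean - mode) / sigma
pearson_median = 3 * (mean - median) / sigma

# statistics.quantiles uses its "exclusive" method by default;
# other quartile conventions can give slightly different values
q1, q2, q3 = statistics.quantiles(data, n=4)
quartile_skew = ((q3 - q2) - (q2 - q1)) / (q3 - q1)

print(f"Pearson mode coefficient   = {pearson_mode:.3f}")
print(f"Pearson median coefficient = {pearson_median:.3f}")
print(f"quartile skewness          = {quartile_skew:.3f}")
```

For this small sample the two Pearson coefficients are not equal, but all three measures are positive, which matches a distribution whose right tail is longer.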

 


 

Outliers

Outliers are observations that appear to deviate markedly from other members of the sample in which they occur.

[Figure: examples of outliers]

 

When computing a line of best fit or carrying out other statistical operations, it is good practice to identify outliers and discard them before processing the data, because a few extreme values can distort the result.
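
The text does not say how to decide which values count as outliers. One common convention, assumed here purely for illustration, is to flag any value more than 1.5 times the interquartile range below Q1 or above Q3. The sketch below applies that rule in Python to a hypothetical data set containing one extreme value.

```python
import statistics

# Hypothetical data: ten ordinary values plus one extreme value
data = [1, 1, 3, 4, 4, 5, 7, 7, 9, 10, 40]

# Quartiles (default "exclusive" method; other conventions may differ slightly)
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Assumed rule: anything beyond 1.5 * IQR from the quartiles is an outlier
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = [x for x in data if x < lower or x > upper]
kept = [x for x in data if lower <= x <= upper]

print("outliers:", outliers)  # outliers: [40]
print(f"mean with outliers    = {statistics.mean(data):.2f}")
print(f"mean without outliers = {statistics.mean(kept):.2f}")
```

Comparing the two means shows how strongly a single extreme value can pull a summary statistic.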

 
