Introduction
- Ram, Arjun, Ravi and Ashok are four friends in the same class who are awaiting their exam results. All of them scored 85%, which was surprising because they had performed very differently across papers: some had written English very well and others had written it badly. So how did this happen?
- Let us look at their scorecards for the individual subjects, each marked out of 100:
Student | English | Hindi | Maths | Science | Total percentage |
---|---|---|---|---|---|
Ram | 70 | 96 | 85 | 89 | 85 |
Arjun | 99 | 45 | 99 | 97 | 85 |
Ravi | 82 | 88 | 80 | 90 | 85 |
Ashok | 60 | 82 | 99 | 99 | 85 |
- The total percentage is the average of the percentages in the individual subjects. Although the total percentage is the same for all four friends, their scoring patterns are very different.
- Hence, the average (mean) gives only the overall picture and hides the individual contribution of each element.
- In other words, the average tells us about the typical size or value of the elements of a dataset (total percentage), not about how the values are spread, i.e. how far above or below the average each element lies (percentage in each subject).
- A measure of dispersion overcomes this drawback of the mean: it helps us understand the contribution of each element in a dataset.
- Dispersion measures the extent to which the values of the elements differ from the mean of the dataset. In the example above, a measure of dispersion gives an idea of how far Ram's score in each subject lies above or below 85.
- A population is the collection of all objects of a specified group that share some common parameters, e.g. the residents of the State of Maharashtra, or all tigers in a tiger reserve.
- The members of a population are known as its elements, e.g. each tiger is an element of the population defined as all tigers in the tiger reserve. The total number of elements in a population is known as the population size.
- Analysing an entire population is often difficult and time-consuming, so a few elements are taken from the population to form a sample for analysis. In other words, a sample is a subset of the population, e.g. if a few tigers are selected from the tiger reserve for a health check-up, the tigers examined form a sample. The total number of elements in a sample is known as the sample size.
- There are various ways to measure the dispersion of a dataset which we will study in upcoming sections.
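The point made with the scorecard above can be checked with a short Python sketch. It assumes the scores are stored in a dictionary (the variable names `scores`, `averages` and `deviations` are illustrative only) and prints each student's average together with the deviation of every subject score from that average.

```python
from statistics import mean

# Scores out of 100, copied from the scorecard above.
# Subject order: English, Hindi, Maths, Science.
scores = {
    "Ram":   [70, 96, 85, 89],
    "Arjun": [99, 45, 99, 97],
    "Ravi":  [82, 88, 80, 90],
    "Ashok": [60, 82, 99, 99],
}

averages = {}
deviations = {}
for student, marks in scores.items():
    avg = mean(marks)
    averages[student] = avg
    # How far each subject score lies above (+) or below (-) the average.
    deviations[student] = [m - avg for m in marks]
    print(student, "average:", avg, "deviations:", deviations[student])
```

All four averages come out the same, while the lists of deviations differ widely. That difference is what a measure of dispersion is meant to capture.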
Semideviation
- Semideviation is defined as the square root of the average squared deviation below the mean.
- The formulas for the semideviation of a population and of a sample are given below:
- Population Semideviation = $\sqrt{\frac{1}{n}\sum_{i=1}^{N} D_i^{2}}$
- Sample Semideviation = $\sqrt{\frac{1}{n-1}\sum_{i=1}^{N} D_i^{2}}$
- Where D is the difference between the elements of a population/sample and its mean
- n is the population/sample size
- N is the number of elements which are less than the mean of the population/sample
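As a minimal sketch, the definition above might be written in Python as follows. The function name `semideviation` and its `sample` flag are hypothetical, and the sample branch assumes the common convention of dividing by n - 1 instead of n.

```python
import math

def semideviation(values, sample=False):
    """Square root of the averaged squared deviations below the mean."""
    n = len(values)
    if n == 0 or (sample and n < 2):
        raise ValueError("not enough elements")
    mean = sum(values) / n
    # Keep only the N elements strictly below the mean and square
    # their deviations D from the mean.
    squared = [(x - mean) ** 2 for x in values if x < mean]
    # Assumption: divide by n for a population, by n - 1 for a sample.
    divisor = n - 1 if sample else n
    return math.sqrt(sum(squared) / divisor)

print(semideviation([2, 4, 6, 8]))               # population
print(semideviation([2, 4, 6, 8], sample=True))  # sample
```

Note that the divisor is the full size n (or n - 1), not the count N of elements below the mean. N only determines which squared deviations enter the sum.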
Example 1:
- Let us take a population: 1, 2, 5, 8, 10, 12, 15, 19 and calculate its semideviation.
- Solution:
- Step 1: Calculating Mean:
- Mean = (1+2+5+8+10+12+15+19)/8 = 72/8 = 9
- Step 2: Calculating the deviation from the mean (D) for each element below the mean, then squaring it:
Element | Deviation from Mean (D) | $D^{2}$ |
---|---|---|
1 | -8 | 64 |
2 | -7 | 49 |
5 | -4 | 16 |
8 | -1 | 1 |
- The remaining elements of the dataset are greater than the mean, so they are excluded.
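- Step 3: Calculating the sum of the squared deviations divided by the population size, $\frac{1}{n}\sum D^{2}$:
- $\frac{1}{n}\sum D^{2}$ = (64+49+16+1)/8 = 130/8 = 16.25
- Step 4: Taking the square root of the value obtained in Step 3 to get the population semideviation:
- Population Semideviation = $\sqrt{16.25} \approx 4.03$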
- If the given dataset were a sample rather than a population, the sample formula given above would be used instead.
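The four steps of Example 1 can be retraced in Python. This is only an illustrative sketch; the variable names are chosen here and are not part of the example.

```python
import math

population = [1, 2, 5, 8, 10, 12, 15, 19]
n = len(population)

# Step 1: mean of the population.
mean = sum(population) / n

# Step 2: squared deviations of the elements below the mean.
below = [x for x in population if x < mean]
squared = [(x - mean) ** 2 for x in below]

# Step 3: sum of the squared deviations divided by the population size.
mean_sq_below = sum(squared) / n

# Step 4: square root of the Step 3 value.
semidev = math.sqrt(mean_sq_below)

print(mean, below, squared, mean_sq_below, round(semidev, 2))
```

Each printed value can be compared against the corresponding step of the worked solution.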
Semivariance
- Like semideviation, semivariance is a way to measure the dispersion of a sample/population. It can also be obtained by squaring the semideviation.
- To calculate the semivariance, follow the steps above up to Step 3; the value obtained in Step 3 is the semivariance, i.e. for the population in Example 1 the semivariance is 16.25.
- The general formulas for semivariance are:
- Population Semivariance = $\frac{1}{n}\sum_{i=1}^{N} D_i^{2}$
- Sample Semivariance = $\frac{1}{n-1}\sum_{i=1}^{N} D_i^{2}$
- where D is the difference between the elements of a sample/population and its mean
- n is the sample/population size
- N is the number of elements which are less than the mean of the population/sample
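To illustrate the relationship between the two measures, the sketch below computes the semivariance of the population from Example 1 and checks that its square root is the semideviation. The function name `semivariance` and its `sample` flag are hypothetical, and the sample branch assumes the common n - 1 convention.

```python
import math

def semivariance(values, sample=False):
    """Averaged squared deviations of the elements below the mean."""
    n = len(values)
    mean = sum(values) / n
    # Sum D^2 over the N elements strictly below the mean.
    total = sum((x - mean) ** 2 for x in values if x < mean)
    # Assumption: divide by n for a population, by n - 1 for a sample.
    return total / (n - 1 if sample else n)

data = [1, 2, 5, 8, 10, 12, 15, 19]
pop_sv = semivariance(data)
samp_sv = semivariance(data, sample=True)

# The semideviation is the square root of the semivariance.
print(pop_sv, round(math.sqrt(pop_sv), 2))
print(samp_sv, round(math.sqrt(samp_sv), 2))
```

Treating the same eight values as a sample gives a slightly larger result, because the same sum of squared deviations is divided by 7 instead of 8.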