# Chebyshev’s Inequality

## Introduction

• Standard deviation is the most widely used measure of the dispersion of a set of data. It measures how far the elements of a population or dataset deviate from the mean.
• Standard deviation is calculated as the positive square root of the mean of the squared deviations from the mean.
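
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `population_std` and the values in `values` are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math


def population_std(data):
    """Positive square root of the mean of squared deviations from the mean."""
    n = len(data)
    mean = sum(data) / n
    # Squared deviation of every element from the mean
    squared_deviations = [(x - mean) ** 2 for x in data]
    # Divide by n (population form), then take the positive square root
    return math.sqrt(sum(squared_deviations) / n)


# Illustrative values (an assumption, not data from the text): the mean is 5
values = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_std(values))  # 2.0
```

Python's standard library offers the same calculation as `statistics.pstdev`.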

### Chebyshev’s Inequality

• According to Chebyshev’s inequality, for any distribution with finite variance, the proportion of the observations within k standard deviations of the arithmetic mean is at least $1 - 1/k^2$ for all k > 1.
• Let us unpack the above statement.
• The mean, or average, of a dataset is calculated by dividing the sum of all elements of the dataset by the total number of elements in the dataset.
• Mean = (Sum of dataset elements) / (Total number of elements in the dataset)
• Standard Deviation of a dataset is obtained by using the following formula:
• Population Standard Deviation = $\sqrt{\dfrac{\sum D^2}{n}}$
• where D is the difference between an element of the population and the population mean,
• and n is the population size.
• Let us take k = 2. Then,
• as per Chebyshev’s inequality, at least $1 - 1/2^2 = 3/4$, i.e. 75%, of the elements must lie within k = 2 standard deviations of the mean.
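
Stated formally, the inequality can be written as follows. The symbols X (an observation drawn from the distribution), μ (its mean) and σ (its standard deviation) are notation assumed here for illustration; they do not appear in the text above. For a finite dataset, P can be read as the proportion of elements.

```latex
\[
P\bigl(|X - \mu| < k\sigma\bigr) \;\ge\; 1 - \frac{1}{k^{2}}, \qquad k > 1
\]
\[
k = 2:\quad P\bigl(|X - \mu| < 2\sigma\bigr) \;\ge\; 1 - \frac{1}{4} = \frac{3}{4}
\]
```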

Illustration 1: For a dataset A with mean 10 and standard deviation 2, at least 75% of the elements of A must lie within 10 ± 2 × 2, i.e. between 6 and 14.
• Other values of k give the minimum percentage of elements that must lie within k standard deviations of the mean; for example, k = 3 gives at least 1 − 1/9 ≈ 88.9%.
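
To see how the guaranteed percentage grows with k, the bound can be tabulated with a short sketch. The function name `chebyshev_lower_bound` and the particular k values are assumptions made for illustration.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of elements within k standard deviations of the mean."""
    if k <= 1:
        # For k <= 1 the expression 1 - 1/k^2 is zero or negative, so it says nothing
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


# Hypothetical selection of k values
for k in (1.5, 2, 3, 4):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%} of elements")
```

The bound rises quickly at first and then flattens: moving from k = 2 to k = 3 adds far more guaranteed coverage than moving from k = 3 to k = 4.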

#### Example 1:

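• Take the sample 2, 5, 8, 10, 12, 15, 18 and verify Chebyshev’s inequality for k = 2.
• Solution: The mean is 70/7 = 10 and the sum of squared deviations is 186, so the sample standard deviation (dividing by n − 1 = 6) is √31 ≈ 5.568. For k = 2, at least 75% of the elements should lie in the range Mean ± 2 × Standard Deviation, i.e. 10 ± 11.136, or −1.136 to 21.136. (With the population formula, dividing by n = 7, the standard deviation is ≈ 5.155 and the range is −0.309 to 20.309.)
• In our sample, all seven elements lie in this range.
• Hence Chebyshev’s inequality, which requires at least 75% of the elements to fall in this range, is verified.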
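
The check in this example can be reproduced with Python's `statistics` module. This is a sketch under stated assumptions: the helper `fraction_within` is a hypothetical name, and both the sample (n − 1) and population (n) standard deviations are tried, since either choice leads to the same conclusion for this data.

```python
import statistics


def fraction_within(data, k, std):
    """Fraction of elements strictly within k standard deviations of the mean."""
    mean = statistics.mean(data)
    low, high = mean - k * std, mean + k * std
    inside = sum(1 for x in data if low < x < high)
    return inside / len(data)


sample = [2, 5, 8, 10, 12, 15, 18]
k = 2
bound = 1 - 1 / k**2  # Chebyshev's guaranteed minimum fraction

for label, std in (
    ("sample (n - 1)", statistics.stdev(sample)),
    ("population (n)", statistics.pstdev(sample)),
):
    observed = fraction_within(sample, k, std)
    print(f"{label}: std = {std:.3f}, observed = {observed:.0%}, bound = {bound:.0%}")
    # The inequality requires the observed fraction to be at least the bound
    assert observed >= bound
```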