
LOCATION AND DISPERSION MEASURES

Two main types of summary measures are used in descriptive statistics:
1. Measures of Location
2. Measures of Dispersion


Measures of Location
Suppose you are asked to compare the heights of two students: Anto, who is 170 cm tall, and Hasan, who is 160 cm tall. Who is taller? You can answer that question easily. The answer is not so easy, however, if you have to compare the heights of two groups of students. Suppose a school has two groups of athletes: a group of 20 basketball players and a group of 25 volleyball players. Which group is taller, the basketball group or the volleyball group? The difficulty here is that the heights of the students vary within each group. To answer this second question, a representative value is generally used. One may use the mean (or average) as the representative value; alternatives to the mean include the median, mode, quartiles, and percentiles. So, to determine which group is taller, we calculate the mean height of the basketball players and the mean height of the volleyball players, and then compare the two means. In general, when we want to compare a particular attribute across several groups, we need a value that represents each group. In statistics, this representative value (often described as the "center" of the data) is called a measure of location; in some literature it is also called a measure of “central tendency”.
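The group comparison described above can be sketched in a few lines of Python. The heights below are hypothetical values made up for illustration (five per group rather than the 20 and 25 students in the example), and the variable names are assumptions, not part of the original post.

```python
import statistics

# Hypothetical heights in cm, invented for illustration only.
basketball = [178, 182, 175, 190, 185]
volleyball = [172, 180, 176, 184, 178]

# Use the mean as the representative value of each group.
mean_basketball = statistics.mean(basketball)
mean_volleyball = statistics.mean(volleyball)

# The median is an alternative measure of location.
median_basketball = statistics.median(basketball)
median_volleyball = statistics.median(volleyball)

print("Mean height, basketball:", mean_basketball)
print("Mean height, volleyball:", mean_volleyball)

# Compare the two representative values instead of individual heights.
if mean_basketball > mean_volleyball:
    print("On average, the basketball group is taller.")
elif mean_basketball < mean_volleyball:
    print("On average, the volleyball group is taller.")
else:
    print("The two groups have the same mean height.")
```

With these made-up numbers the means are 182 cm and 178 cm, so the basketball group is taller on average. Comparing medians instead would lead to the same conclusion here, although in general the two measures can disagree.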


Measures of Dispersion
Suppose there are two small classes, namely Class A and Class B. Each class consists of 5 students whose math test scores are as follows.
Class A: 50, 60, 70, 80, 90
Class B: 70, 70, 70, 70, 70
The mean math score of Class A is ${\bar{x}}_{A} = \frac{50+60+70+80+90}{5} = 70$ and that of Class B is ${\bar{x}}_{B} = \frac{70+70+70+70+70}{5} = 70$. It turns out that both classes have the same mean! But there is actually a difference in the characteristics of the two classes. The scores in Class A are diverse (i.e. the scores vary), while in Class B there is no variation at all. In statistics, measures of dispersion are used to measure the variability of data. The greater the measure of dispersion, the more diverse the data are. The smaller the measure of dispersion, the less diverse the data are (meaning that the data values do not differ much from one another). There are many types of measures of dispersion, such as: 1) range, 2) quartile deviation, 3) variance, 4) standard deviation, and 5) mean deviation.
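As a minimal sketch, the five measures listed above can be computed for Class A and Class B with Python's standard `statistics` module. Two assumptions are made here that the post does not state: the variance and standard deviation use the population formulas (dividing by n), and the quartiles follow the default method of `statistics.quantiles`; other textbooks use different quartile conventions and may give a slightly different quartile deviation. The function name `dispersion_summary` is hypothetical.

```python
import statistics

class_a = [50, 60, 70, 80, 90]
class_b = [70, 70, 70, 70, 70]

def dispersion_summary(scores):
    """Return the mean and five measures of dispersion for a list of scores."""
    mean = statistics.mean(scores)
    # Quartiles via the default ("exclusive") method; conventions vary.
    q1, _, q3 = statistics.quantiles(scores, n=4)
    return {
        "mean": mean,
        "range": max(scores) - min(scores),
        "quartile_deviation": (q3 - q1) / 2,
        # Population formulas (divide by n) are an assumption here.
        "variance": statistics.pvariance(scores),
        "standard_deviation": statistics.pstdev(scores),
        "mean_deviation": sum(abs(x - mean) for x in scores) / len(scores),
    }

for name, scores in [("Class A", class_a), ("Class B", class_b)]:
    print(name, dispersion_summary(scores))
```

Under these assumptions, Class A has a range of 40, a variance of 200, and a mean deviation of 12, while every measure of dispersion for Class B is 0, even though both classes have a mean of 70.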
