edcommstatistics


Showing posts from July, 2019

CALCULATING THE MEAN OF GROUPED DATA

Sometimes quantitative data are presented in the form of a frequency distribution table (FDT). A typical FDT is as follows.

Suppose that the table presents the durations of 16 cell phone conversations between pairs of teens. There are 2 conversations lasting from 30 to 44 seconds, 3 conversations lasting from 45 to 59 seconds, etc. How do we calculate the mean of the data?

Step 1: Determine the midpoint of each class

If $M_{i}$ denotes the midpoint of class $i$, then $M_{i} = \frac{LB_{i}+UB_{i}}{2}$, where $LB_{i}$ is the lower bound of class $i$ and $UB_{i}$ is the upper bound of class $i$. The lower bounds of classes 1, 2, 3, 4, 5 are 30, 45, 60, 75, 90, respectively, and the upper bounds are 44, 59, 74, 89, 104, respectively. Then $M_{1} = \frac{30+44}{2} = 37$. Similarly, $M_{2} = \frac{45+59}{2} = 52$. Continuing this way, we have the following table.

Step 2: Multiply each class frequency $f_{i}$ by the corresponding class midpoint ...
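The midpoint step can be sketched in Python as follows. The class bounds come from the example. The helper `grouped_mean` is a hypothetical name, and it assumes the usual grouped-mean formula $\frac{\sum f_{i} M_{i}}{\sum f_{i}}$, which the excerpt does not reach. The full frequency column is not shown in the excerpt, so the helper is defined but not applied to the example.

```python
# Class bounds from the example (durations in seconds).
lower_bounds = [30, 45, 60, 75, 90]
upper_bounds = [44, 59, 74, 89, 104]

# Step 1: midpoint of each class, M_i = (LB_i + UB_i) / 2.
midpoints = [(lb + ub) / 2 for lb, ub in zip(lower_bounds, upper_bounds)]
print(midpoints)


def grouped_mean(class_midpoints, frequencies):
    """Assumed formula: sum(f_i * M_i) / sum(f_i). Hypothetical helper."""
    if len(class_midpoints) != len(frequencies):
        raise ValueError("need one frequency per class")
    total = sum(frequencies)
    if total == 0:
        raise ValueError("frequencies must not sum to zero")
    return sum(f * m for f, m in zip(frequencies, class_midpoints)) / total
```

Once the frequencies are read off the table, passing them to `grouped_mean` together with `midpoints` would give the mean of the grouped data under that assumed formula.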

MORE ON CALCULATING THE VARIANCE

In previous posts, you were introduced to the concept of variance. Now we will learn more about it.

Population variance

Formula 1: $\sigma^2 = \frac{\sum_{i=1}^{n} (X_{i} - \bar{X})^2}{n}$

Formula 2: $\sigma^2 = \frac{\sum_{i=1}^{n} {X_{i}}^2}{n} - (\bar{X})^2$

The two formulae give the same result.

Example 1

Mr. Ahmad had 6 cell phone counters. In July 2016, the net profits obtained from the counters were 4, 7, 5, 3, 5, 6 (in millions of rupiah). What was the variance of the net profit of the six counters in July 2016?

Answer

In this example, we are asked to calculate the variance of the net profits of the six counters owned by Mr. Ahmad in July 2016. The given data cover all of the counters, so they are data on every member of the population. To solve this problem, first calculate the mean.

$\bar{X} = \text{Rp}\ \frac{4+7+5+3+5+6}{6}\ \text{million} = \text{Rp}\ 5\ \text{million}$

Using Formula 1, calculate the variance. $\sigma^2 = \frac{(4-5)^2+(7-5)^2+(5-5)^2+(3...
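As a quick check that Formula 1 and Formula 2 agree, here is a minimal Python sketch using the six net profits from Example 1. The variable names are illustrative only. The standard-library function `statistics.pvariance` is included as an independent cross-check.

```python
import statistics

# Net profits of the six counters, in millions of rupiah (from Example 1).
profits = [4, 7, 5, 3, 5, 6]
n = len(profits)
mean = sum(profits) / n

# Formula 1: average squared deviation from the mean.
var_formula_1 = sum((x - mean) ** 2 for x in profits) / n

# Formula 2: mean of the squares minus the square of the mean.
var_formula_2 = sum(x ** 2 for x in profits) / n - mean ** 2

# Cross-check with the standard library's population variance.
var_stdlib = statistics.pvariance(profits)

print(mean, var_formula_1, var_formula_2, var_stdlib)
```

All three variance values should agree up to floating-point rounding. Note that the variance is in squared units of the data.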

THE MODE

In the previous posts, three measures of location were discussed: the arithmetic mean, the median, and the quartiles. In this post, one more measure of location, the mode, will be discussed. The mode can be applied to data with any level of measurement.

Given a data set, its mode (denoted by Mo) is the value with the highest frequency of occurrence. In other words, the mode of a data set is the value that appears most often.

Example 1

In a class, there are 30 male students and 10 female students. There are more male than female students, so the mode of sex in the class is male.

Example 2

The following is a list of the favourite colours of some kindergarten students. The mode of the favourite colours is red, because red appears most often.

Example 3

The following are the math test scores of 13 junior high school students: 76 47 56 42 78 80 ...
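The definition of the mode can be sketched in Python with `collections.Counter`. The data below reproduce Example 1 (30 male and 10 female students). The helper name `find_modes` is hypothetical, and it returns a list because a data set can have more than one most frequent value.

```python
from collections import Counter


def find_modes(data):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(data)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


# Example 1: sex of the 40 students in the class.
sexes = ["male"] * 30 + ["female"] * 10
print(find_modes(sexes))
```

Because the function only counts occurrences, it works for nominal data such as sex or colour as well as for numeric scores.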

THE QUARTILES

Let $x_{1}, x_{2}, x_{3}, \ldots, x_{n}$ be the data under consideration, ordered so that $x_{1} \le x_{2} \le x_{3} \le \ldots \le x_{n}$ (i.e. they have been sorted from the smallest to the largest). The first quartile ($Q_{1}$) and the third quartile ($Q_{3}$) of the data are defined as follows.

$Q_{1} = x_{L}$ where $L = \frac{1}{4} (n+1)$

$Q_{3} = x_{U}$ where $U = \frac{3}{4} (n+1)$

Example 1

The following are the grades of a Social Statistics assignment achieved by 15 students.

47 56 71 65 29 68 78 73 80 75 29 38 65 90 95

Find the first and third quartiles of the data.

Answer

First, sort the data from the lowest to the highest. So we have:

29 29 38 47 56 65 65 68 71 73 75 78 80 90 95

Now, let $x_{1} = 29$, $x_{2} = 29$, $x_{3} = 38$, ..., $x_{15} = 95$. To determine the first quarti...
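The two position formulas can be sketched in Python as below, using the 15 grades from Example 1. The function name `quartiles` is hypothetical. The sketch assumes that $n+1$ is divisible by 4, so that $L$ and $U$ are whole numbers. How to handle fractional positions is not covered in this excerpt, so the function raises an error in that case instead of guessing.

```python
def quartiles(data):
    """Return (Q1, Q3) using positions L = (n+1)/4 and U = 3(n+1)/4.

    Assumption: (n + 1) is divisible by 4, so both positions are integers.
    """
    xs = sorted(data)
    n = len(xs)
    if (n + 1) % 4 != 0:
        raise ValueError("this sketch only handles whole-number positions")
    lower_pos = (n + 1) // 4        # L, 1-based position of Q1
    upper_pos = 3 * (n + 1) // 4    # U, 1-based position of Q3
    # Python lists are 0-based, so subtract 1 from each position.
    return xs[lower_pos - 1], xs[upper_pos - 1]


grades = [47, 56, 71, 65, 29, 68, 78, 73, 80, 75, 29, 38, 65, 90, 95]
q1, q3 = quartiles(grades)
print(q1, q3)  # positions 4 and 12 of the sorted data: 47 and 78
```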

THE MEDIAN

When a data set is at the ordinal level, we can use the median as an alternative to the arithmetic mean. We use this alternative measure especially when we cannot calculate the arithmetic mean. For example, suppose we want to compare the clothes sizes of two groups of students.

[Tables of clothes sizes for Group A and Group B]

To compare which group has the larger clothes size, we cannot calculate the mean. How, then, do we determine a value that represents each group's clothes size? This is where the median is useful!

To determine the median of a data set, the first step is to sort the data from the smallest to the largest. The value located in the middle is the median. If Me is the median of a data set, then the number of data values ≤ Me equals the number of data values ≥ Me.

Group A’s sorted clothes size data are: S - M - L - L - XL. The third value from the left, that is L, is the median of Group A’s clothes size. We denote it as: Me = L. Group B’s sorted clothes size data are: S - S - S - M...
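The sort-then-take-the-middle procedure can be sketched in Python for ordinal data such as clothes sizes. The helper name `ordinal_median` and the ordering list `SIZE_ORDER` (S < M < L < XL) are assumptions made for this illustration. The sketch handles only an odd number of values, where the middle value is unambiguous.

```python
# Assumed ordering of the clothes sizes, from smallest to largest.
SIZE_ORDER = ["S", "M", "L", "XL"]


def ordinal_median(values, order):
    """Median of ordinal data with an odd number of observations."""
    rank = {label: position for position, label in enumerate(order)}
    xs = sorted(values, key=lambda label: rank[label])
    if len(xs) % 2 == 0:
        raise ValueError("this sketch only handles an odd number of values")
    return xs[len(xs) // 2]  # middle element of the sorted data


# Group A's clothes sizes from the example.
group_a = ["S", "M", "L", "L", "XL"]
print(ordinal_median(group_a, SIZE_ORDER))  # L
```

Sorting by rank instead of alphabetically matters here, since a plain string sort would place "L" before "M" and "S".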

LOCATION AND DISPERSION MEASURES

There are two types of measures in statistics:

1. Measures of Location
2. Measures of Dispersion

Measures of Location

Suppose you are asked to compare the heights of two students: Anto, who is 170 cm tall, and Hasan, who is 160 cm tall. Who is taller? You can easily answer that question. But the answer is not that easy if you have to compare the heights of two groups of students. Suppose that in a school there are two groups of athletes. The first is a group of 20 basketball players and the other is a group of 25 volleyball players. Which group is taller: the basketball group or the volleyball group?

The difficulty encountered here is that the heights of the students vary within each group. To answer the second question, a representative value is generally used. One may use the mean (or average) as the representative. Alternatives to the mean are the median, mode, quartiles, and percentiles. So, to determine which group is taller, we calculate the m...