edcommstatistics

Posts

RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS

Life is full of uncertainties: the monthly revenues of a business, the number of vehicles left in a parking lot every day, the number of phone calls we get every day, the money we spend on watching movies every year, the names of the customers who will join the queue for the teller, the colours of the cars that will enter a toll gate in the next hour, whether or not the next flight will be late, etc. The first four are random variables, but the other three are not. Among the things that are uncertain, some are random variables and some are not. So, what characterizes a random variable? Let’s see some definitions. A random variable is a function that associates a real number with each element in the sample space. (Walpole, 1993) A random variable is a function associated with an experiment whose values are real numbers and their occurrence in the trials depends on chance. (Kreyszig, 1993) First, the values of a random variable must be real numbers....

SAMPLE PROBLEMS ON THEORETICAL PROBABILITIES

Problem Set I: The Probability of a Single Event

Sample Problem #1

An experiment consists of tossing 4 coins simultaneously, once. Find the probability that at least two heads (H) appear.

Answer

The sample space is S = {HHHH, HHHT, HHTH, HTHH, THHH, HHTT, HTHT, THHT, HTTH, THTH, TTHH, TTTH, TTHT, THTT, HTTT, TTTT}.

$\mid S \mid = 16$

The event is E = {HHHH, HHHT, HHTH, HTHH, THHH, HHTT, HTHT, THHT, HTTH, THTH, TTHH}.

$\mid E \mid = 11$

$P(E) = \frac{\mid E \mid}{\mid S \mid} = \frac{11}{16} = 0.6875$

So, the probability that at least two heads appear is 0.6875.

Sample Problem #2

Fifteen cards are numbered from 1 to 15. The experiment consists of picking a card at random from the set of cards. Find the probability of getting a card whose number is a multiple of 3.

Answer

The sample space is S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}.

$\mid S \mid = 15$

The event is E = {3, 6, 9, 12, 15}.

$\mid E \mid = 5$

$P(E) = \frac{\mid E \mid}{\mid S \mid} = \f...
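The counting in the two sample problems above can be checked by brute-force enumeration. The following is only a minimal sketch; the variable names are illustrative and do not come from the post.

```python
from fractions import Fraction
from itertools import product

# Sample Problem #1: all 2**4 equally likely outcomes of tossing 4 coins.
coin_space = ["".join(t) for t in product("HT", repeat=4)]
at_least_two_heads = [s for s in coin_space if s.count("H") >= 2]
p_coins = Fraction(len(at_least_two_heads), len(coin_space))

# Sample Problem #2: cards numbered 1 to 15, event = multiples of 3.
card_space = list(range(1, 16))
multiples_of_3 = [c for c in card_space if c % 3 == 0]
p_cards = Fraction(len(multiples_of_3), len(card_space))

print(len(coin_space), len(at_least_two_heads), float(p_coins))
print(len(card_space), len(multiples_of_3), p_cards)
```

Using `Fraction` keeps the probabilities exact, so the ratio |E| / |S| is not affected by floating-point rounding.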

EXPERIMENT CONCERNING BERNOULLI-LIKE PROCESS

The Bernoulli process must possess the following properties:

1. The experiment consists of n repeated trials. [Each trial is called a Bernoulli trial.]
2. Each trial results in an outcome that may be classified as a success or a failure.
3. The probability of success, denoted by p, remains constant from trial to trial.
4. The repeated trials are independent.

Now we are going to do an experiment that approximates a Bernoulli process, i.e. a Bernoulli-like process. (If you have a deep understanding of the Bernoulli process, you know why the experiment is not an exact Bernoulli process but only an approximation to it.) We are going to run 20 Bernoulli-like processes. In each process, there are 4 Bernoulli trials (n = 4). Ideally, the same single die is rolled 4 times by the same person. But for the sake of time efficiency, instead of rolling a die 4 times in each process, each of the 4 students in a group rolls a die simultaneously. Let’s define that a succe...
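The classroom experiment can be imitated with a small simulation. This is a sketch under a loud assumption: the excerpt is cut off before it defines a success, so the code treats rolling a six as a success purely for illustration. The function name `run_processes` and its parameters are hypothetical as well.

```python
import random

def run_processes(n_processes=20, n_trials=4, success_face=6, seed=None):
    """Return the number of successes observed in each simulated process.

    ASSUMPTION: a success means the die shows `success_face`; the post's
    actual definition of a success is truncated in the excerpt.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_processes):
        # One process = n_trials independent rolls of a fair six-sided die.
        rolls = [rng.randint(1, 6) for _ in range(n_trials)]
        counts.append(sum(1 for r in rolls if r == success_face))
    return counts

counts = run_processes(seed=1)
print(counts)  # 20 values, each between 0 and 4
```

Unlike the classroom version, the simulated rolls are independent with a constant success probability, so the simulation is an idealized Bernoulli process rather than a Bernoulli-like one.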

CONSTRUCTING THE FREQUENCY DISTRIBUTION TABLE

The frequency distribution table is a table that divides data into groups (classes) and shows how many data values occur in each group/class. Below is an example of a frequency distribution table. Now we will learn how to create a frequency distribution table. Suppose we have a collection of ungrouped data on last year’s advertising expenditures of 40 logistics companies, recorded in millions of rupiah. To construct a frequency distribution table from the ungrouped data, apply the following steps.

Step 1: Find the range of the data

The range (R) is defined as the difference between the largest and the smallest data values. In this case, R = 307 - 242 = 65.

Step 2: Determine the number of categories/classes (k)

Applying Sturges' rule (k = 1 + 3.322 log n, where n = the number of data values), we have:

$k = 1 + 3.322 \: \log \: 40 \approx 6.32$

As the value of k must be a natural number, 6.32 is rounded up to 7, so k = 7.

Step 3: Determine the class width (c)

To find c, use $...
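Steps 1 and 2 above can be reproduced in a few lines. This sketch uses only the values quoted in the text (n = 40, largest value 307, smallest value 242) and stops before Step 3, whose formula is cut off in the excerpt. The variable names are illustrative.

```python
import math

n = 40                        # number of data values
largest, smallest = 307, 242  # extremes quoted in the text

# Step 1: range of the data.
data_range = largest - smallest

# Step 2: Sturges' rule with a base-10 logarithm, then round up.
k_exact = 1 + 3.322 * math.log10(n)
k = math.ceil(k_exact)

print(data_range, round(k_exact, 2), k)
```

`math.ceil` implements the "round up to the next natural number" step, which is why 6.32 becomes 7 rather than 6.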

THE QUARTILES AND MEDIAN OF GROUPED DATA

In this post, we will learn how to determine the quartiles when quantitative data are presented in a frequency distribution table. For example, we have the following data, showing the Flesch Readability Scores of 80 monthly bulletin articles published by Britt and Co. Ltd. Find the quartiles of these readability scores. To answer this, first augment the table with a new column to the right of the frequency column, namely the Data Numbers column. There are 5 data values in the first class, so the class contains data no. 1 to no. 5. There are 7 data values in the second class, so the class contains data no. 6 to no. 12. There are 13 data values in the third class, so the class contains data no. 13 to no. 25. Continuing this way, we get the following: In this case, finding the first quartile means finding the 20th data value after the data have been ordered from the smallest to the largest (20 = ¼ × 80). Note that the 20th data value is in the third class (20 is ...
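The Data Numbers column is a running total of the class frequencies, so locating the class that holds the 20th value can be sketched as below. Only the three frequencies stated in the text (5, 7, 13) are used; the remaining classes are not shown in the excerpt and are deliberately left out. All names are illustrative.

```python
from itertools import accumulate

n = 80                          # total number of articles
known_frequencies = [5, 7, 13]  # only the first three classes appear in the text

# Cumulative counts give the last data number in each class.
cumulative = list(accumulate(known_frequencies))

# (first data number, last data number) for each class.
data_numbers = [(end - f + 1, end) for f, end in zip(known_frequencies, cumulative)]

# Position of the first quartile as used in the post: one quarter of n.
q1_position = n // 4

# 1-based index of the class whose data numbers cover that position.
q1_class = next(
    i for i, (first, last) in enumerate(data_numbers, start=1)
    if first <= q1_position <= last
)

print(data_numbers, q1_position, q1_class)
```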

CALCULATING THE MEAN OF GROUPED DATA

Sometimes quantitative data are presented in the form of a frequency distribution table (FDT). A typical FDT is as follows. Suppose that the table above presents the durations of 16 cell phone conversations between pairs of teens. There are 2 conversations lasting from 30 to 44 seconds, 3 conversations lasting from 45 to 59 seconds, etc. How do we calculate the mean of the data?

Step 1: Determine the midpoint of each class

If $M_{i}$ denotes the midpoint of class $i$,

$M_{i} = \frac{LB_{i}+UB_{i}}{2}$

where $LB_{i}$ = lower bound of class $i$ and $UB_{i}$ = upper bound of class $i$. The lower bounds of classes 1, 2, 3, 4, 5 are 30, 45, 60, 75, 90, respectively, and the upper bounds are 44, 59, 74, 89, 104, respectively. Then, $M_{1} = \frac{30+44}{2} = 37$. Similarly, $M_{2} = \frac{45+59}{2} = 52$. Continuing this way, we have the following table.

Step 2: Multiply each class frequency $f_{i}$ by the corresponding class midpoint ...
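Step 1 can be checked directly from the class bounds listed above. The helper `grouped_mean` is a hypothetical name that implements the usual grouped-data mean, the sum of frequency times midpoint divided by the total frequency. It is defined but not applied here, because the excerpt gives only the first two class frequencies.

```python
lower_bounds = [30, 45, 60, 75, 90]
upper_bounds = [44, 59, 74, 89, 104]

# Step 1: midpoint of each class, M_i = (LB_i + UB_i) / 2.
midpoints = [(lb + ub) / 2 for lb, ub in zip(lower_bounds, upper_bounds)]
print(midpoints)

def grouped_mean(midpoints, frequencies):
    """Mean of grouped data: sum(f_i * M_i) / sum(f_i).

    Hypothetical helper; pass the full frequency column of the table.
    """
    if len(midpoints) != len(frequencies):
        raise ValueError("midpoints and frequencies must have the same length")
    total = sum(frequencies)
    return sum(f * m for f, m in zip(frequencies, midpoints)) / total
```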

MORE ON CALCULATING THE VARIANCE

In previous posts, you were introduced to the concept of variance. Now we will learn more about it.

Population variance

Formula 1: $\sigma^2 = \frac{\sum_{i=1}^{n} (X_{i} - \bar{X})^2}{n}$

Formula 2: $\sigma^2 = \frac{\sum_{i=1}^{n} {X_{i}}^2}{n} - (\bar{X})^2$

The two formulae give the same result.

Example 1

Mr. Ahmad had 6 cell phone counters. In July 2016, the net profits obtained from the counters were: 4, 7, 5, 3, 5, 6 (in millions of rupiah). What was the variance of the net profits of the six counters in July 2016?

Answer

In this example, we are asked to calculate the variance of the net profits of the six counters owned by Mr. Ahmad in July 2016. The given data are the net profits of all the counters, so they are the data of all population members. To solve this problem, first calculate the mean.

$\bar{X} = \text{Rp } \frac{4+7+5+3+5+6}{6} \text{ million} = \text{Rp } 5 \text{ million}$

Using Formula 1, calculate the variance.

$\sigma^2 = \frac{(4-5)^2+(7-5)^2+(5-5)^2+(3...
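The claim that the two formulae agree can be checked numerically on the data of Example 1. This is a small sketch with illustrative variable names; `statistics.pvariance` from the Python standard library is used only as an independent cross-check.

```python
import statistics

profits = [4, 7, 5, 3, 5, 6]  # net profits in millions of rupiah (Example 1)
n = len(profits)
mean = sum(profits) / n

# Formula 1: average squared deviation from the mean.
var_formula_1 = sum((x - mean) ** 2 for x in profits) / n

# Formula 2: mean of the squares minus the square of the mean.
var_formula_2 = sum(x ** 2 for x in profits) / n - mean ** 2

# Cross-check with the standard library's population variance.
var_library = statistics.pvariance(profits)

print(mean, var_formula_1, var_formula_2, var_library)
```

With floating-point arithmetic the two formulae may differ in the last few digits, so compare them with a small tolerance rather than with `==`.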