
Section 1.8 Sampling

Subsection 1.8.1 The Law of Averages (Ch 16,17)

In Chapter 16, the text says that with repeated chance processes, the absolute error goes up and the relative error goes down as the number of draws increases. The authors call this statement the "Law of Averages". The language is vague: the authors presume that the statement is intuitive and expect the reader to pick up the meaning from the examples in the reading. Using the square root law from Chapter 17, we can make the meaning of the Law of Averages precise.
In a box model, with random draws (taken with replacement), we have
\begin{align*} \text{ expected(sum of draws) }\amp = (\text{average of the box}) \cdot (\text{number of draws})\\ \text{ SE(sum of draws) } \amp = (\text{SD of the box}) \sqrt{\text{number of draws}}. \end{align*}
The absolute error and the relative error are
\begin{align*} \text{ absolute error }\amp = (\text{actual sum of draws}) -(\text{expected(sum of draws)})\\ \text{ relative error } \amp = \frac{ \text{absolute error} }{\text{number of draws}}. \end{align*}
The likely size of the absolute error is estimated by the SE for the sum of draws:
\begin{equation*} \text{ absolute error }\approx \text{ SE(sum of draws) }=(\text{SD of the box}) \sqrt{\text{number of draws}}. \end{equation*}
Dividing both sides of the last equation by \(\text{(number of draws)}\text{,}\) we have
\begin{equation*} \text{ relative error }\approx \frac{(\text{SD of the box})\sqrt{\text{number of draws}} }{\text{number of draws}} = \frac{(\text{SD of the box}) }{\sqrt{\text{number of draws}}}\text{.} \end{equation*}
The final "equals" on the right uses this algebra fact:
\begin{equation*} \frac{1}{\sqrt{A}}=\frac{1}{\sqrt{A}}\frac{\sqrt{A}}{\sqrt{A}}=\frac{\sqrt{A}}{A}, \end{equation*}
applied to \(A=\text{(number of draws)}\text{.}\)
Summarizing, we have
\begin{align*} \text{ absolute error }\amp \approx (\text{SD of the box}) \sqrt{\text{number of draws}}\\ \text{ relative error }\amp \approx \frac{(\text{SD of the box}) }{\sqrt{\text{number of draws}}}. \end{align*}
From the last two equations, we can see clearly that the absolute error goes up and the relative error goes down as the number of draws increases.
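The two summary formulas can be tabulated for a few sample sizes. The following is a minimal sketch in Python; the function name `likely_errors` and the 0-1 box with SD 1/2 are illustrative assumptions, not taken from the text.

```python
import math

def likely_errors(sd_box, n_draws):
    """Return (likely absolute error, likely relative error) for the sum of n_draws draws."""
    absolute = sd_box * math.sqrt(n_draws)   # SE(sum of draws)
    relative = absolute / n_draws            # same as sd_box / sqrt(n_draws)
    return absolute, relative

# Hypothetical box: one ticket marked 0 and one ticket marked 1, so the SD of the box is 1/2.
SD_BOX = 0.5
for n in (100, 10_000, 1_000_000):
    absolute, relative = likely_errors(SD_BOX, n)
    print(f"{n:>9} draws: absolute error ~ {absolute:7.1f}, relative error ~ {relative:.4f}")
```

Each time the number of draws is multiplied by 100, the likely absolute error is multiplied by 10 and the likely relative error is divided by 10.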

Subsection 1.8.2 A special case SD formula (Ch 17)

Suppose that every entry of a list
\begin{equation*} X=A,A,A,\ldots,A,B,B,B,\ldots, B \end{equation*}
is one of two values \(A\) and \(B\text{,}\) with \(A\lt B\text{,}\) and suppose there are \(r\) entries with value \(A\) and \(s\) entries with value \(B\text{,}\) so that the total number of entries in the list \(X\) is \(r+s\text{.}\) The SD of the list \(X\) is given by the following formula.
\begin{equation} \text{SD}(X)= (B-A)\sqrt{\left(\frac{r}{r+s}\right)\left(\frac{s}{r+s}\right)} = (B-A)\frac{\sqrt{rs}}{r+s}\tag{1.8.1} \end{equation}
In the Freedman text (p.298), the number \(B\) is called the "big number", the number \(A\) is called the "small number", the fraction \(s/(r+s)\) is called the "fraction with the big number" and the fraction \(r/(r+s)\) is called the "fraction with the small number".
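As a quick check, formula (1.8.1) can be compared with the SD computed directly from the entries of a list. This is a sketch only; the helper name `two_value_sd` and the sample values \(A=2\text{,}\) \(B=7\text{,}\) \(r=3\text{,}\) \(s=1\) are made up for illustration.

```python
import math
import statistics

def two_value_sd(a, b, r, s):
    """Shortcut (1.8.1): SD of a list with r copies of a and s copies of b, where a < b."""
    return (b - a) * math.sqrt(r * s) / (r + s)

# Hypothetical list: three entries equal to 2 and one entry equal to 7.
a, b, r, s = 2, 7, 3, 1
entries = [a] * r + [b] * s

shortcut = two_value_sd(a, b, r, s)
direct = statistics.pstdev(entries)   # population SD: divides by the number of entries

print(shortcut, direct)               # the two values should agree up to rounding
```

Note that `statistics.pstdev` divides by the number of entries, which matches the SD used here; `statistics.stdev` divides by one less and would give a different answer.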

Subsection 1.8.3 Sum of Draws Practice Problems (Ch 16–18)

Exercises

Problems 1–10 below are about the following game. A fair die is rolled 600 times. On each roll, you win $4 if you get a 6. You lose $1 if you get something different from a 6.
1.
The total winnings will be about $____ give or take $____ or so.
Answer.
The box model has six tickets: one ticket marked 4 and five tickets marked \(-1\text{.}\)
\begin{align*} \AVE(\text{box}) \amp = -1/6\\ \SD(\text{box}) \amp = 5\sqrt{\frac{1}{6}\cdot \frac{5}{6}} \approx 1.86\\ \text{expected(sum) }\amp = -100\\ \text{SE(sum) } \amp \approx 45.64 \end{align*}
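The numbers in this answer can be reproduced from the box itself. The sketch below assumes a helper named `sum_of_draws` (a name chosen here, not from the text) that applies the expected value and SE formulas from Subsection 1.8.1.

```python
import math
import statistics

def sum_of_draws(box, n_draws):
    """Return (expected value, SE) for the sum of n_draws draws made with replacement."""
    ave = statistics.fmean(box)    # average of the box
    sd = statistics.pstdev(box)    # SD of the box (divides by the number of tickets)
    return ave * n_draws, sd * math.sqrt(n_draws)

# Box for the game: one ticket marked 4 and five tickets marked -1.
box = [4, -1, -1, -1, -1, -1]
expected, se = sum_of_draws(box, 600)
print(round(expected, 2), round(se, 2))
```

The same helper works for the other boxes in these problems by changing the list of tickets.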
2.
The chance of breaking even or better is about ____.
Answer.
\begin{align*} z \amp \approx \frac{0 - (-100)}{45.64} \approx 2.19\\ P\amp \approx 1.4\% \end{align*}
3.
The number of sixes rolled will be about ____ give or take ____.
Answer.
The box model has one ticket marked 1 and five tickets marked 0.
\begin{align*} \AVE(\text{box}) \amp = 1/6\\ \SD(\text{box}) \amp = \sqrt{\frac{1}{6}\cdot \frac{5}{6}} \approx .373\\ \text{expected(sum) } \amp = 100\\ \text{SE(sum) }\amp \approx 9.13 \end{align*}
4.
The chance that the number of sixes will be ten or more above the expected number is about ____.
Answer.
\begin{align*} z \amp \approx \frac{110 - 100}{9.13} \approx 1.1\\ P \amp \approx 14\% \end{align*}
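Instead of a normal curve table, the tail area can be computed with Python's standard library. This is a sketch under the same normal approximation as the answer above (no continuity correction); the variable names are illustrative.

```python
from statistics import NormalDist

expected, se = 100, 9.13            # expected value and SE from the answer to Problem 3
z = (110 - expected) / se           # ten or more above the expected number
chance = 1 - NormalDist().cdf(z)    # area under the normal curve to the right of z

print(f"z = {z:.2f}, chance = {chance:.1%}")
```

The result rounds to about 14%, in line with the table lookup.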
5.
The sum of the numbers on the die face on all the rolls will be about ____ give or take ____.
Answer.
The box model has the tickets 1,2,3,4,5,6.
\begin{align*} \AVE(\text{box}) \amp = 3.5\\ \SD(\text{box}) \amp \approx 1.71 \text{ (no shortcut!!)}\\ \text{expected(sum) } \amp = 2100\\ \text{SE(sum) } \amp \approx 41.83 \end{align*}
6.
The chance that the sum will be in the range 2050–2150 is about ____.
Answer.
\begin{align*} z \amp \approx \frac{2150 - 2100}{41.83} \approx 1.2\\ P \amp \approx 77\% \end{align*}
7.
There is about a 50% chance that the sum will be in the range 2100 plus or minus ____.
Answer.
\begin{gather*} z \text{ for 50\% is about } .67\\ \text{range is } 2100 \pm 28 \end{gather*}
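Problems 6 and 7 run the normal curve in opposite directions: Problem 6 goes from a range to a chance, and Problem 7 goes from a chance to a range. The sketch below shows both with `statistics.NormalDist`; the variable names are assumptions made for this illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()           # standard normal curve: mean 0, SD 1
expected, se = 2100, 41.83          # values from the answer to Problem 5

# Problem 6: chance that the sum lands between 2050 and 2150.
z6 = (2150 - expected) / se
central_area = std_normal.cdf(z6) - std_normal.cdf(-z6)

# Problem 7: a central area of 50% leaves 25% in each tail,
# so the cutoff is the 75th percentile of the standard normal curve.
z7 = std_normal.inv_cdf(0.75)
half_width = z7 * se

print(f"Problem 6: z = {z6:.2f}, chance = {central_area:.0%}")
print(f"Problem 7: z = {z7:.2f}, range = {expected} plus or minus {half_width:.0f}")
```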
8.
The total number of even rolls will be about ____ give or take ____.
Answer.
One possible box model has three tickets marked 1 and three tickets marked 0. A simpler box model has one ticket marked 1 and one ticket marked 0, each drawn with probability 1/2.
\begin{align*} \AVE(\text{box}) \amp = 1/2\\ \SD(\text{box})\amp = 1/2\\ \text{expected(sum) } \amp = 300\\ \text{SE(sum) } \amp \approx 12.25 \end{align*}
9.
The chance that the number of even rolls will be off by 20 or more from the expected number is about ____.
Answer.
\begin{align*} z \amp \approx \frac{320 - 300}{12.25} \approx 1.63\\ P \amp \approx 10\% \end{align*}
10.
On every 6th roll (that is, on roll numbers 6, 12, 18, etc.) you win a blue marble if exactly half of the last 6 rolls came up even. The chance that you win 35 or more blue marbles in 600 rolls is ____. Use the continuity correction to be as precise as possible.
Answer.
The probability of exactly 3 evens in 6 rolls is \({6 \choose 3}\left(\frac{1}{2}\right)^6 = 31.25\%\) (using the binomial probability formula). A box model has 3125 1’s and 6875 0’s, with 100 draws. The average of the box is \(.3125\) and the SD of the box is \(\sqrt{.3125\cdot .6875} \approx .464\text{.}\) The expected sum is \(31.25\) and the SE for the sum is \(4.64\text{.}\) The \(z\) value for 35 blues is
\begin{equation*} z = \frac{34.5-31.25}{4.64}\approx .70. \end{equation*}
The normal curve table has an area of about \(52\%\) for \(z=.70\text{,}\) so the probability is about \((100-52)/2=24\%\) (if you do not use the continuity correction, you get \(z=.81\) and a probability of about \(21\%\)).
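The steps in this answer can be retraced numerically, and the normal approximation can be compared with the exact binomial chance. The sketch below is one way to do this with the standard library; the exact binomial sum is an added cross-check, not part of the original answer, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Chance of exactly 3 evens in a block of 6 rolls (binomial formula).
p = math.comb(6, 3) * 0.5 ** 6

n = 100                                  # 600 rolls give 100 blocks of 6
expected = n * p                         # expected number of blue marbles
se = math.sqrt(n * p * (1 - p))          # SD of the 0-1 box times sqrt(number of draws)

# Normal approximation with the continuity correction: "35 or more" starts at 34.5.
z = (34.5 - expected) / se
approx = 1 - NormalDist().cdf(z)

# Cross-check (an addition for illustration): exact binomial chance of 35 or more.
exact = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(35, n + 1))

print(f"p = {p}, z = {z:.2f}, normal approximation = {approx:.1%}, exact = {exact:.1%}")
```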