11.1: Introduction
Suppose you are a public health researcher trying to determine the average daily protein intake for adults in your town. You might conduct a 24-hour dietary recall for 50 randomly selected residents, record their protein consumption, and average the results together. By doing this, you have obtained a point estimate of the true population mean.
Similarly, if you are a basketball coach trying to determine the percentage of successful free throws an athlete makes under fatigue, you would count the number of baskets made and divide that by the total number of attempts during a training session. In this case, you have obtained a point estimate for the true proportion \(p\), which is the parameter of the binomial probability distribution. Whether we are measuring grams of protein or successful free throws, these point estimates serve as our best guess for what is happening in the larger population.
We use sample data to make generalizations about an unknown population. This part of statistics is called inferential statistics. The sample data help us to make an estimate of a population parameter. We realize that the point estimate is most likely not the exact value of the population parameter, but close to it. After calculating point estimates, we construct interval estimates, called confidence intervals. Beyond a simple average, or point estimate, statistics gives us an estimate to which we can attach a probability of accuracy, which we call a confidence level. We make inferences with a known level of probability.
In this chapter, you will learn to construct and interpret confidence intervals. You will also learn a new distribution, the Student's t-distribution, and how it is used with these intervals. Throughout the chapter, it is important to keep in mind that the confidence interval is random: its endpoints are computed from the sample and change from sample to sample. It is the population parameter that is fixed.
If you worked in a physical therapy clinic, you might be interested in the mean number of patients who visit each month due to an acute injury. If so, you could review records and calculate the sample mean, \(\bar{x}\), and the sample standard deviation, \(s\). You would use \(\bar{x}\) to estimate the population mean and \(s\) to estimate the population standard deviation. The sample mean, \(\bar{x}\), is the point estimate for the population mean, \(\mu\). The sample standard deviation, \(s\), is the point estimate for the population standard deviation, \(\sigma\).
\(\bar{x}\) and \(s\) are each called a statistic.
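As a rough illustration of these two statistics, the sketch below computes \(\bar{x}\) and \(s\) with Python's standard library. The monthly counts are hypothetical values invented for this example; they do not come from any clinic's records.

```python
import statistics

# Hypothetical counts of acute-injury visits for 12 months
# (illustrative data only, not taken from the text).
monthly_visits = [3, 1, 2, 4, 2, 1, 3, 2, 2, 0, 3, 1]

# Sample mean: the point estimate of the population mean mu.
x_bar = statistics.mean(monthly_visits)

# Sample standard deviation (n - 1 divisor): the point estimate of sigma.
s = statistics.stdev(monthly_visits)

print(f"x_bar = {x_bar:.3f}")
print(f"s     = {s:.3f}")
```

Note that `statistics.stdev` divides by \(n-1\), which is the usual definition of the sample standard deviation \(s\); `statistics.pstdev` would divide by \(n\) instead.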
A confidence interval is another type of estimate but, instead of being just one number, it is an interval of numbers. The interval of numbers is a range of values calculated from a given set of sample data. The confidence interval is likely to include the unknown population parameter.
Suppose, for the physical therapy example, we do not know the population mean \(\mu\), but we do know the sample mean and that the population standard deviation is \(\sigma=1\) and our sample size is 100. Then, by the central limit theorem, the standard deviation of the sampling distribution of the sample means is
\[\dfrac{\sigma}{\sqrt{n}}=\dfrac{1}{\sqrt{100}}=0.1\]
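The same calculation can be sketched in a few lines of Python. The function name `standard_error` and the extra sample sizes 25 and 400 are assumptions added here only to show how the value shrinks as \(n\) grows; the text itself uses \(\sigma=1\) and \(n=100\).

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of the sample mean."""
    return sigma / math.sqrt(n)

sigma = 1.0  # population standard deviation, as given in the text

# n = 100 is the text's example; 25 and 400 are added for comparison.
for n in (25, 100, 400):
    print(f"n = {n:3d}  standard error = {standard_error(sigma, n):.3f}")
```

Quadrupling the sample size halves the standard error, because \(n\) enters through its square root.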
The Empirical Rule, which applies to the normal distribution, says that in approximately \(95 \%\) of the samples, the sample mean, \(\bar{x}\), will be within two standard deviations of the population mean \(\mu\). For our example, two standard deviations is \((2)(0.1)=0.2\). The sample mean \(\bar{x}\) is likely to be within 0.2 units of \(\mu\).
If \(\bar{x}\) is within 0.2 units of \(\mu\), then \(\mu\) is within 0.2 units of \(\bar{x}\). So, although \(\mu\) is unknown, in approximately \(95 \%\) of samples it lies in the interval whose lower number is calculated by taking the sample mean and subtracting two standard deviations, \((2)(0.1)\), and whose upper number is calculated by taking the sample mean and adding two standard deviations. In other words, \(\mu\) is between \(\bar{x}-0.2\) and \(\bar{x}+0.2\) in \(95 \%\) of all the samples.
For the physical therapy example, suppose that a sample produced a sample mean \(\bar{x}=2\). Then with \(95 \%\) confidence the unknown population mean \(\mu\) is between
\[\bar{x}-0.2=2-0.2=1.8 \text { and } \bar{x}+0.2=2+0.2=2.2\]
We say that we are \(95 \%\) confident that the unknown population mean number of patients who visit each month due to an acute injury is between 1.8 and 2.2. The \(95 \%\) confidence interval is \((1.8, 2.2)\). Please note that we talked in terms of \(95 \%\) confidence using the Empirical Rule. The Empirical Rule for two standard deviations gives only approximately \(95 \%\) of the probability under the normal distribution. To be precise, the area within two standard deviations of the mean under a normal distribution is about \(95.45 \%\). To obtain a confidence level of exactly \(95 \%\), we would use 1.96 standard deviations.
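The difference between 2 and 1.96 standard deviations can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `coverage` is an assumption introduced for this example, while \(\bar{x}=2\) and the standard error 0.1 are the values from the text.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

x_bar = 2.0            # sample mean from the text
standard_error = 0.1   # sigma / sqrt(n) from the text

# Critical value that leaves 2.5% in each tail, i.e. 95% in the middle.
z_exact = std_normal.inv_cdf(0.975)

interval_rule = (x_bar - 2 * standard_error, x_bar + 2 * standard_error)
interval_exact = (x_bar - z_exact * standard_error, x_bar + z_exact * standard_error)

print(f"area within 2 SD:    {coverage(2):.4f}")
print(f"area within 1.96 SD: {coverage(1.96):.4f}")
print(f"z for exactly 95%:   {z_exact:.4f}")
print(f"interval with z = 2: ({interval_rule[0]:.3f}, {interval_rule[1]:.3f})")
print(f"interval with exact z: ({interval_exact[0]:.3f}, {interval_exact[1]:.3f})")
```

The interval built with the exact critical value is slightly narrower than the one built with the Empirical Rule's value of 2.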
The \(95 \%\) confidence interval implies two possibilities. Either the interval \((1.8,2.2)\) contains the true mean \(\mu\), or our sample produced an \(\bar{x}\) that is not within 0.2 units of the true mean \(\mu\). The first possibility happens for \(95 \%\) of well-chosen samples. It is important to remember that the second possibility happens for \(5 \%\) of samples, even though correct procedures are followed.
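One way to see these two possibilities is to simulate many samples from a population whose mean is known and count how often the interval \(\bar{x} \pm 0.2\) captures it. The sketch below makes several assumptions that are not in the text: the population is taken to be normal with \(\mu=2\) and \(\sigma=1\), 5,000 samples are drawn, and the random seed is fixed so the run is repeatable.

```python
import math
import random

random.seed(0)  # fixed seed (an arbitrary choice) so the run is repeatable

mu = 2.0      # "true" mean, known here only because this is a simulation
sigma = 1.0   # population standard deviation, as in the text
n = 100       # sample size, as in the text
half_width = 2 * sigma / math.sqrt(n)  # two standard errors = 0.2
trials = 5_000  # number of simulated samples (assumed for illustration)

hits = 0
for _ in range(trials):
    # Draw one sample of size n and compute its mean.
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(sample) / n
    # Does the interval built from this sample contain the true mean?
    if x_bar - half_width <= mu <= x_bar + half_width:
        hits += 1

coverage = hits / trials
print(f"fraction of intervals containing mu: {coverage:.3f}")
```

The printed fraction should land close to 0.95. Each individual interval either contains \(\mu\) or does not; the percentage describes the long-run behavior of the procedure.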
Remember that a confidence interval is created for an unknown population parameter like the population mean, \(\mu\).
For the confidence interval for a mean, the formula would be:
\[\mu=\bar{X} \pm Z_\alpha \dfrac{\sigma}{\sqrt{n}}\]
Or written another way as:
\[\bar{X}-Z_\alpha \dfrac{\sigma}{\sqrt{n}} \leq \mu \leq \bar{X}+Z_\alpha \dfrac{\sigma}{\sqrt{n}}\]
Here \(\bar{X}\) is the sample mean, \(Z_\alpha\) is determined by the level of confidence desired by the analyst, and \(\sigma / \sqrt{n}\) is the standard deviation of the sampling distribution for means given to us by the Central Limit Theorem.
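The formula can be sketched as a small Python function. The function name `z_confidence_interval` is hypothetical, and the sketch assumes a two-sided interval, so the critical value is chosen to leave \(\alpha/2\) in each tail of the standard normal distribution, where \(\alpha\) is one minus the confidence level. It also assumes \(\sigma\) is known, as in the example above.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Two-sided confidence interval for a mean when sigma is known.

    Assumes the sampling distribution of the mean is (approximately) normal.
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    if n <= 0 or sigma <= 0:
        raise ValueError("n and sigma must be positive")

    alpha = 1 - confidence
    # Critical value leaving alpha / 2 in the upper tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Physical therapy example from the text: x_bar = 2, sigma = 1, n = 100.
lower, upper = z_confidence_interval(2, 1, 100, confidence=0.95)
print(f"95% interval: ({lower:.3f}, {upper:.3f})")

# A higher confidence level gives a wider interval.
lower99, upper99 = z_confidence_interval(2, 1, 100, confidence=0.99)
print(f"99% interval: ({lower99:.3f}, {upper99:.3f})")
```

Raising the confidence level increases the critical value and therefore widens the interval, while a larger sample size narrows it.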


