Medicine LibreTexts

4.15.1: Observational Studies

    Observational studies

    In an observational study, the researcher records information on the subjects without interfering in any way. Below are the types of observational studies most commonly used to analyze data and help determine where intervention is needed to improve the health of a population:

    Ecological studies are very basic statistical analyses of incidence or prevalence of some disease, risk factor or protective factor within a population or group. These types of studies are helpful in identifying a potentially vulnerable population, and/or the need for more research for particular risk factors.

    As an example, an ecological study might find that counties with more office workers also had higher rates of back pain. We couldn’t conclude any causation from this type of study. There could be many reasons for this pattern. One, it could be that office work causes back pain. Two, it could be that people with back pain change careers so that they can work in an office. Three, it could be that office workers have better healthcare plans and therefore go to the doctor more often for musculoskeletal issues like back pain, so their back pain is recorded more often than that of other types of workers. Because the data describe groups rather than individuals, we can’t even be sure that it is the office workers in those counties who have the back pain. The point is that there could be many indirect ways these two things are associated, or there could be no association at all at the individual level.

    The example above illustrates that ecological studies are helpful for generating ideas about populations of interest or possible relationships between risk or protective factors and health outcomes. They do not prove causation.

    Cross-sectional studies take a snapshot of a population at one point in time.

    This is similar to a cross-sectional view of a limb in anatomy; it’s almost like you’re taking a “slice” of the population and looking at the current prevalence of specific risk or protective factors and health outcomes. Often we do a cross-sectional study to identify a specific need in a population. For example, we might do a cross-sectional study to identify how many office workers in a particular area currently have low back pain. Perhaps we survey these workers or offer a screening at each office location. The difference between an ecological study and a cross-sectional study is that the ecological study looks at prevalence statistics in groups, whereas a cross-sectional study uses individual data. In our example, this means we can be sure whether it’s the office workers who have low back pain.
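
    As a rough illustration of what individual-level data makes possible, the sketch below computes point prevalence (people with the condition divided by people surveyed) separately for each job type. The survey records, the group labels, and the function name prevalence_by_group are all made up for this example.

```python
from collections import defaultdict

# Hypothetical survey responses, one per person: (job_type, has_low_back_pain).
# These values are invented for illustration only.
survey = [
    ("office", True), ("office", True), ("office", True),
    ("office", False), ("office", False), ("office", False),
    ("non_office", True), ("non_office", False),
    ("non_office", False), ("non_office", False),
]


def prevalence_by_group(records):
    """Return point prevalence (cases / people surveyed) for each group."""
    cases = defaultdict(int)
    totals = defaultdict(int)
    for group, has_condition in records:
        totals[group] += 1
        if has_condition:
            cases[group] += 1
    return {group: cases[group] / totals[group] for group in totals}


prevalence = prevalence_by_group(survey)
for group, value in prevalence.items():
    print(f"{group}: {value:.0%}")
```

    An ecological study would only have one back pain rate and one share of office workers per county, so a breakdown by job type like this one would not be possible.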

    Cohort studies (also known as longitudinal studies) are typically much more expensive and time-consuming than other observational studies. A cohort study starts by identifying risk factors - or exposures - in a group (or cohort) of people and then follows this group over time to see what health outcomes occur. Thus, cohort studies are typically "prospective", which means tracking data forward in time with follow-up data collection on the same cohort. There are “retrospective” cohort studies, but these still begin by identifying exposures or risk factors from historical data in a particular group, and then follow up with this group later on to determine outcomes (Celentano & Szklo, 2018).
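
    Because a cohort study follows exposed and unexposed people forward to see who develops the outcome, its results are commonly summarized as a risk ratio (the text above does not name this measure; it is added here as a standard example). The sketch below assumes a hypothetical cohort, and all counts are invented.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk of the outcome in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


# Hypothetical cohort followed over time (invented numbers):
# 200 exposed people, 30 of whom develop the outcome;
# 300 unexposed people, 15 of whom develop the outcome.
rr = risk_ratio(30, 200, 15, 300)
print(f"Risk ratio: {rr:.2f}")
```

    A risk ratio above 1 suggests the outcome was more common in the exposed group, although on its own it says nothing about why.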

    In case-control (also known as retrospective) studies, researchers observe existing groups (cases and controls) to identify potential associations between exposures and outcomes.

    This type of study starts by identifying a group of people who have experienced a specific outcome, and then looks for specific exposures, health behaviors or other factors to try to determine the cause or causes. Looking backwards like this is termed "retrospective". Although other study types can also be retrospective, case-control studies are always retrospective because they start with the outcome and work backwards.

    Let’s say for example we wanted to find out if smoking was a risk factor for lung cancer. For a case-control study, we would start by identifying cases - perhaps find a group of lung cancer patients at a hospital - and then ask them whether or not they had ever smoked. If we found that a large number of them had smoked, we might think that this indicates that smoking is a risk factor for lung cancer.

    But wait - what if lung cancer rates were also high in non-smokers? This is why the control part of the case-control study is needed. We’ll have to find similar people for our control group who don’t have lung cancer, and ask them if they have ever smoked. If the proportion of smokers is much greater in the cancer group than in the non-cancer group, then we might conclude there is an association between smoking and lung cancer after all.
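
    The comparison between cases and controls is usually expressed as an odds ratio (a standard measure that the text above does not name). The following is a minimal sketch with invented counts; the function name and all numbers are assumptions made for illustration.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical case-control data (invented numbers):
# 100 lung cancer cases: 80 had smoked, 20 had never smoked.
# 100 controls without lung cancer: 40 had smoked, 60 had never smoked.
or_smoking = odds_ratio(
    exposed_cases=80,
    unexposed_cases=20,
    exposed_controls=40,
    unexposed_controls=60,
)
print(f"Odds ratio: {or_smoking:.1f}")
```

    An odds ratio near 1 would mean smoking was about equally common among cases and controls; a value well above 1 points to an association.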

    Sometimes a single exposure of any amount is the risk factor, but other times the “devil is in the dose”. If there is a dose-response relationship between smoking and lung cancer, then those people who smoked more would have a higher risk of lung cancer. So perhaps instead of asking our cases and our controls whether or not they had ever smoked, we would ask them how many cigarettes they smoked per day, and for how many years. We would probably also want to know if they had quit smoking, and if so, how long ago.
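
    One common way to combine cigarettes per day and years smoked into a single dose is the pack-year, which assumes 20 cigarettes per pack. The sketch below computes pack-years and then compares hypothetical dose groups against never-smokers; the cut-off of 20 pack-years, the group labels, and all counts are invented for illustration.

```python
CIGARETTES_PER_PACK = 20  # conventional pack size used to define a pack-year


def pack_years(cigarettes_per_day, years_smoked):
    """Packs smoked per day multiplied by the number of years smoked."""
    return cigarettes_per_day / CIGARETTES_PER_PACK * years_smoked


def odds_ratio_vs_reference(cases, controls, ref_cases, ref_controls):
    """Odds ratio for one dose group compared with the reference group."""
    return (cases * ref_controls) / (ref_cases * controls)


# Hypothetical (cases, controls) counts per dose group; numbers are invented.
never_smokers = (20, 60)  # reference group
dose_groups = {
    "under 20 pack-years": (30, 25),
    "20 or more pack-years": (50, 15),
}

odds_ratios = {
    label: odds_ratio_vs_reference(cases, controls, *never_smokers)
    for label, (cases, controls) in dose_groups.items()
}

print(f"30 cigarettes/day for 10 years = {pack_years(30, 10):.1f} pack-years")
for label, value in odds_ratios.items():
    print(f"{label}: odds ratio {value:.1f} vs. never-smokers")
```

    If the odds ratio rises as the dose group rises, as it does in these made-up numbers, that pattern is consistent with a dose-response relationship.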

    There can be many issues that a researcher has to consider when performing a case-control study. We might have a selection bias for example. Perhaps the control group was selected from the same hospital and the controls were exposed to second-hand smoke or air pollution in that city or neighborhood. Perhaps the controls aren’t similar enough to the cases in terms of age, sex, socioeconomic status, etc. If we were to compare a young control group with an older case group for example, the younger controls might just not have developed the disease yet. Thus, the selection of the controls can cause a bias in the research.

    Another issue is that researchers must control for variables other than the one being studied that could influence the outcome (confounding variables). For example, if a researcher wanted to find out if coffee consumption decreased the risk of obesity, they might find a case group of individuals with obesity, and a control group of individuals without obesity, and then ask them about coffee consumption. Maybe those people who drink coffee tend to also exercise or eat more pastries. If the researchers do not ask their study subjects about those habits, those habits could be affecting the person’s body weight independent of the coffee consumption. Yet another pitfall is called reverse causality, which occurs when the exposure is actually caused by the outcome. Taking the coffee/obesity example, it could be that people who have obesity often try to lose weight, and may do so by consuming coffee (since it is a mild stimulant). Therefore, it wouldn't be the coffee consumption that had an effect on obesity, but obesity that had an effect on coffee consumption.
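
    One common way to check for a variable like exercise is to split the data into strata and compare the crude odds ratio with a stratum-adjusted one. The sketch below uses the Mantel-Haenszel pooled odds ratio, a standard technique that the text above does not mention; the strata and every count are invented so that the effect is easy to see.

```python
def odds_ratio(a, b, c, d):
    """a: cases exposed, b: cases unexposed, c: controls exposed, d: controls unexposed."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata, each given as an (a, b, c, d) tuple."""
    strata = list(strata)
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts (invented): exposure = drinks coffee, outcome = obesity.
# Tuple order: (cases with coffee, cases without, controls with coffee, controls without)
strata = {
    "exercises regularly": (10, 5, 40, 20),
    "does not exercise": (10, 30, 10, 30),
}

# Crude table: ignore exercise and add the strata together.
crude = tuple(sum(column) for column in zip(*strata.values()))
crude_or = odds_ratio(*crude)
adjusted_or = mantel_haenszel_or(strata.values())

print(f"Crude odds ratio: {crude_or:.2f}")
for label, counts in strata.items():
    print(f"  {label}: odds ratio {odds_ratio(*counts):.2f}")
print(f"Adjusted (Mantel-Haenszel) odds ratio: {adjusted_or:.2f}")
```

    In these invented numbers the crude odds ratio is below 1, which would make coffee look protective, yet within each exercise group the odds ratio is 1. The apparent effect comes entirely from coffee drinkers being more likely to exercise.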

    Last but not least, all humans tend to have some recall bias. As case-control studies often depend on surveys or interviews, it’s important to identify which participants might be inclined to remember or not remember certain details, or perhaps fudge some of what they report on a survey. For example, many people overestimate their height and underestimate their weight on self-reported data (Jain, 2010). If the study uses self-reported height and weight to determine obesity, then this data could be inaccurate. When someone has received a diagnosis, they are often “looking” for something in their past that caused it, which might make them more likely than people without the diagnosis to remember specific things that happened to them (Lewallen & Courtright, 1998).

    Case-control studies are often considered simple and inexpensive to perform, since data can be gathered from interviews, surveys, or databases. They can help a researcher develop or confirm a suspicion, and if the study is designed well, they may even allow for an association to be inferred. However, they typically need to be replicated and/or bolstered by other types of studies (like randomized controlled trials) in order for a causal association to be confirmed. Case-control studies often contradict each other, depending on the methods used and how well the researchers controlled for biases and other variables. It’s important to review the methods used and keep in mind common biases and confounding factors when reading a case-control study article.

    References

    Celentano, D. D., & Szklo, M. (2018). Gordis epidemiology. Elsevier Health Sciences.

    Jain, R. B. (2010). Regression models to predict corrected weight, height and obesity prevalence from self-reported data: Data from BRFSS 1999–2007. International Journal of Obesity, 34(11), 1655–1664. https://doi.org/10.1038/ijo.2010.80

    Lewallen, S., & Courtright, P. (1998). Epidemiology in practice: Case-control studies. Community Eye Health, 11(28), 57–58. https://www.researchgate.net/publica...ontrol_Studies


    4.15.1: Observational Studies is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
