
Understanding Medical Statistics


Critical Appraisal of Medical Literature

Evidence-Based Medicine Mastery

Critical Appraisal Guide & Interactive Assessment

Study Structure

A "valid" study starts with a logical design. If the question is about therapy, an RCT is the gold standard. If it's about rare exposures, a Case-Control is more appropriate. Use the PICO framework to ensure the study population matches your patient.

The PICO Filter

  • P: Patient/Population (Age, comorbidities, severity)
  • I: Intervention (Drug dose, timing, duration)
  • C: Comparison (Placebo or current standard of care?)
  • O: Outcome (Death, stroke, or just a lab value?)

The Two Pillars of Validity

Determining whether a study is "good" requires looking through two distinct lenses.

Internal Validity

Can the results be trusted? Is the difference between groups actually due to the intervention, or just noise and bias?

  • Randomization: Were the groups balanced at the start?
  • Blinding: Did knowledge of treatment affect behavior or assessment?
  • Follow-up: Was it long enough? Did too many patients drop out (Attrition)?

External Validity

Does it matter to my patient? Can these results be generalized to the broader population outside the trial?

  • Generalizability: Does the study population look like mine?
  • Real-world Setting: Was the study done in a high-tech academic center or a setting like yours?
  • Practicality: Is the intervention affordable and accessible?

The "Valid Study" Checklist

Ask these five questions when reading the Methods and Results sections.

1. Was allocation concealed?

This is distinct from blinding. It ensures the person enrolling patients cannot predict which group the patient will enter, preventing "cherry-picking" of healthy patients for the intervention group.

2. Were patients analyzed in the groups they were randomized to?

Look for Intention-to-Treat (ITT). If a study only analyzes patients who completed the full course (Per-Protocol), the results are often overly optimistic and biased.

3. Were the groups similar at baseline?

Check "Table 1". If the intervention group is significantly younger or has fewer comorbidities, the study's validity is compromised from the start.

4. Is the outcome clinically significant?

A "statistically significant" decrease in HbA1c of 0.1% is medically trivial. A "valid" study that changes practice must show an effect that matters to the patient's quality of life or survival.

5. Who funded the study?

Conflicts of interest don't automatically invalidate a study, but industry-funded trials are statistically more likely to report positive findings. High validity requires transparent reporting of funding.

Interpreting Results

RR (Relative Risk)

Tells you the magnitude of effect. RR < 1 means the intervention reduced the risk.

ARR (Absolute Risk Reduction)

The "honest" number. The literal percentage point difference between groups.

NNT (Number Needed to Treat)

Calculated as 1 / ARR. Translates the statistics into "How many patients do I need to treat to prevent one bad outcome?"
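To make these three measures concrete, here is a minimal Python sketch using made-up trial counts (the group sizes, event counts, and function name are illustrative assumptions, not figures from any cited study):

```python
# Hypothetical trial numbers, chosen only to illustrate RR, ARR, and NNT.

def risk_metrics(events_tx: int, n_tx: int, events_ctrl: int, n_ctrl: int):
    """Compute relative risk, absolute risk reduction, and number needed to treat."""
    risk_tx = events_tx / n_tx          # event rate in the intervention group
    risk_ctrl = events_ctrl / n_ctrl    # event rate in the control group
    rr = risk_tx / risk_ctrl            # relative risk (< 1 favors the intervention)
    arr = risk_ctrl - risk_tx           # absolute risk reduction (percentage-point difference)
    nnt = 1 / arr if arr != 0 else float("inf")  # number needed to treat
    return rr, arr, nnt

# Example: 8% of treated patients vs. 12% of controls had the outcome.
rr, arr, nnt = risk_metrics(events_tx=80, n_tx=1000, events_ctrl=120, n_ctrl=1000)
print(f"RR = {rr:.2f}, ARR = {arr:.1%}, NNT = {nnt:.0f}")
# RR = 0.67, ARR = 4.0%, NNT = 25
```

Note how a relative risk of 0.67 sounds impressive, while the absolute difference of 4 percentage points (NNT of 25) is the "honest" number that drives clinical decisions.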

Mastery Assessment

20 Questions • Single-question mode


Sensitivity, Specificity, Predictive Values, and Likelihood Ratios

Screening (surveillance) tests are tools used to assess the likelihood that a patient has a certain disease. They are not definitive, but a positive result heightens suspicion and warrants a gold standard diagnostic test to rule the diagnosis in or out. The goal of screening is to reduce morbidity and mortality in a population (Maxim, Niebo, & Utell, 2014). Examples of screening tests include routine EKGs, PSA, Pap smears, and mammograms. For example, a male with an elevated PSA may have prostate cancer, BPH, or prostatitis; the positive screening result must be followed up with the established gold standard test that is regarded as definitive. In this case, a prostate biopsy is the definitive test, as it will reveal the etiology of the elevated PSA. Screening tests are less invasive and less costly, whereas the gold standard test may be more invasive, more expensive, or come too late (e.g., a diagnosis made only at autopsy). Ideally, gold standard tests such as coronary angiography, breast biopsy, or colposcopy would have 100% sensitivity and specificity. In reality this is often not the case; they may simply be the best test available given the clinical picture at the time (Maxim, Niebo, & Utell, 2014).

 

Sensitivity and Specificity

 

Sensitivity and specificity are two measures that evaluate the performance of medical tests. Sensitivity is the ability of a test to correctly identify patients who have the disease, whereas specificity is the ability of a test to correctly identify patients who do NOT have the disease (Akobeng, 2007). A perfect clinical test would have 100% sensitivity and specificity, but most tests do not achieve such performance. The terms sensitivity, specificity, and positive/negative predictive value all refer to the diagnostic utility of a test. It is important to remember that the predictive values depend on disease prevalence, which varies with the population being tested, whereas sensitivity and specificity are characteristics of the test itself.

A highly sensitive test is useful for ruling a disease out when the result is negative. Likewise, a highly specific test is useful for ruling a disease in when the result is positive. There are times when a clinician must work with a test that has low sensitivity and high specificity, or vice versa. For example, the sensitivity and specificity of the urine nitrite dipstick are roughly 27% and 94%, respectively. If a woman whose symptoms suggest a UTI tests negative, does that mean she does not have a UTI? With the sensitivity of the test being so low, the clinician cannot tell. However, if the dipstick comes back positive, she very likely does have a UTI, because the specificity is high (van Stralen et al., 2009).
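As a quick illustration of how sensitivity and specificity fall out of a 2x2 table compared against a gold standard, here is a small Python sketch; the counts are fabricated to reproduce the dipstick figures quoted above and are not data from the cited paper:

```python
# Hypothetical 2x2 counts (tp, fn, fp, tn are made-up numbers) showing how
# sensitivity and specificity are calculated against a gold-standard diagnosis.

def sens_spec(tp: int, fn: int, fp: int, tn: int):
    sensitivity = tp / (tp + fn)   # proportion of diseased patients the test catches
    specificity = tn / (tn + fp)   # proportion of non-diseased patients the test clears
    return sensitivity, specificity

# Counts chosen so the result matches the nitrite dipstick figures quoted above
# (sensitivity ~27%, specificity ~94%); they are illustrative, not study data.
sens, spec = sens_spec(tp=27, fn=73, fp=6, tn=94)
print(f"Sensitivity = {sens:.0%}, Specificity = {spec:.0%}")
# Sensitivity = 27%, Specificity = 94%
```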

When assessing sensitivity and specificity, remember these mnemonics:

 

SnNout: a highly Sensitive test, when Negative, rules the disease OUT.

SpPin: a highly Specific test, when Positive, rules the disease IN.

 

A limitation of sensitivity and specificity is that, on their own, they do not tell the clinician the probability that a particular patient has the disease being tested for. When a result comes back positive, the patient usually asks what his or her chance (i.e., probability) of having the disease is; the converse is true when the result comes back negative. These questions are answered by the positive and negative predictive values (PPV, NPV) of a diagnostic test, which describe a patient's probability of having (or not having) the disease once the result is known. The drawback of PPV and NPV is that they vary with the population chosen and with disease prevalence, so they should not be transferred from one setting or patient group to another (Attia, 2003). For example, screening for SLE utilizes the ANA test. In the general population, this yields a low PPV. However, screening for SLE with the same test in patients who present with malar rash and joint pain yields a high PPV (Lalkhen & McCluskey, 2008).
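The prevalence dependence of PPV and NPV is easy to demonstrate numerically. The sketch below assumes illustrative sensitivity and specificity values for the ANA test (placeholders, not published figures) and compares a low-prevalence general-population setting with a high-prevalence symptomatic group:

```python
# Sketch of how PPV and NPV shift with prevalence, using assumed ANA test
# characteristics (sens/spec values are illustrative placeholders).

def predictive_values(sens: float, spec: float, prevalence: float):
    """Bayes-style calculation of PPV and NPV for a given pre-test prevalence."""
    tp = sens * prevalence                 # true positives per patient tested
    fp = (1 - spec) * (1 - prevalence)     # false positives
    tn = spec * (1 - prevalence)           # true negatives
    fn = (1 - sens) * prevalence           # false negatives
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv

sens, spec = 0.95, 0.80  # assumed ANA performance, for illustration only
for prev in (0.001, 0.30):  # general population vs. malar rash + joint pain
    ppv, npv = predictive_values(sens, spec, prev)
    print(f"prevalence {prev:.1%}: PPV = {ppv:.1%}, NPV = {npv:.1%}")
# At 0.1% prevalence the PPV is under 1%; at 30% prevalence it rises to ~67%.
```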

 

Probability

 

Speaking of probability, let's assess the probability that a patient who presents with chest pain and dyspnea has a pulmonary embolus. That depends on the patient's pre-test probability. Pre-test probability is the estimate that a patient has a disease before any testing, based on symptoms, complaints, and/or a predetermined probability. For example, what is the pre-test probability of a positive stress test in a 19-year-old female versus a 78-year-old male with diabetes and an extensive pack-year smoking history? Pre-test probability draws on clinical intuition, clinical experience, known risk factors, and gestalt. It is anchored in the prevalence of a disease, which can change as the population changes. "Population" can mean the general population or a defined group of people who are at risk for a particular disease. For example, the incidence of appendicitis is low in the general population, but the prevalence of appendicitis is much higher among ED patients presenting with RLQ abdominal pain (The NNT, 2019).

 

Likelihood Ratios

 

Based on further "testing" (which may include physical assessment, history, ECG, radiologic testing, and laboratory workup), a patient's pre-test probability may end up low or high, allowing the clinician to rule the disease out or in, respectively. So if the patient above presents with SOB and chest pain and also had surgery 2 weeks ago, what is your pre-test probability of a PE? The patient's Wells score is at least 4.5 (3 points for PE being more likely than the alternatives and 1.5 points for surgery within the last 4 weeks). We have not yet assessed the vital signs, which could add further to the pre-test probability. The combination of those factors establishes a moderate probability of PE (roughly 20.5%), with a positive likelihood ratio of 1.3 and a negative likelihood ratio of 0.7 (Family Practice Notebook, 2018). A definitive test such as a CT angiogram is still needed. However, if the pre-test probability is low (a Wells score of 0 to 2 points) and the patient's D-dimer is negative, the clinician can exclude PE.
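For readers who want to see the tally, here is a minimal sketch of the score calculation. The criterion names and point values below are the commonly cited Wells PE criteria, included as an assumption for illustration rather than taken from the sources cited above; confirm against the original score before clinical use.

```python
# Hypothetical Wells-score tally for the patient described above.
# Point values are the commonly published Wells PE criteria (assumed here).

WELLS_PE_CRITERIA = {
    "clinical signs of DVT": 3.0,
    "PE more likely than alternative diagnoses": 3.0,
    "heart rate > 100": 1.5,
    "immobilization or surgery in the previous 4 weeks": 1.5,
    "previous DVT or PE": 1.5,
    "hemoptysis": 1.0,
    "malignancy": 1.0,
}

def wells_score(findings: set) -> float:
    """Sum the points for every criterion the patient meets."""
    return sum(points for name, points in WELLS_PE_CRITERIA.items() if name in findings)

# Patient above: PE is the most likely diagnosis and surgery was 2 weeks ago.
score = wells_score({"PE more likely than alternative diagnoses",
                     "immobilization or surgery in the previous 4 weeks"})
print(score)  # 4.5 -> moderate pre-test probability
```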

Based on the results of the additional testing, the clinician arrives at a new probability of disease, referred to as the post-test probability. Applying likelihood ratios allows clinicians to estimate the probability that the patient has the disease in question. Likelihood ratios (LRs) do not depend on the prevalence of a disease; they are derived from the known sensitivity and specificity of the test. LRs let clinicians quantify how much a test result shifts the probability of disease: the larger the positive LR, the more a positive result increases the likelihood of disease, and the smaller the negative LR, the more a negative result decreases it. Bayes' theorem supplies the machinery. Because no test is 100% accurate, the clinical context (the pre-test probability) must be combined with the test result; Bayes' theorem converts the pre-test probability to odds, multiplies by the likelihood ratio for the observed result, and converts back to a probability, yielding the post-test probability, as in the sketch below.
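A minimal worked example, reusing the moderate-probability PE figures quoted earlier (pre-test probability of about 20.5%, LR+ of 1.3, LR- of 0.7); the function name and rounding are my own:

```python
# Bayes' theorem with likelihood ratios: probability -> odds, apply LR, odds -> probability.

def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    pre_test_odds = pre_test_prob / (1 - pre_test_prob)   # probability -> odds
    post_test_odds = pre_test_odds * likelihood_ratio     # apply the LR for the observed result
    return post_test_odds / (1 + post_test_odds)          # odds -> probability

pre_test = 0.205  # moderate Wells-based pre-test probability quoted above
print(f"After a positive result: {post_test_probability(pre_test, 1.3):.1%}")  # ~25%
print(f"After a negative result: {post_test_probability(pre_test, 0.7):.1%}")  # ~15%
```

With LRs this close to 1, neither result moves the probability very far, which is why a definitive test such as CT angiography is still needed in the moderate-probability patient.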

Measures of test accuracy are useful, but clinicians also bring prior assumptions about a patient's chance of having a disease, based on the clinical picture, gestalt, and disease prevalence in the population. This totality forms the clinician's impression of the case (i.e., the probability of disease) and guides the workup needed to refine the post-test probability. While these measures of accuracy and probability help rule a disease in or out, they do not override clinical judgment and the "gut" feeling.

 

References:

Akobeng, A.K. (2007). Understanding diagnostic tests 1: Sensitivity, specificity and predictive values. Acta Paediatrica, 96, 338-341.

Attia, J. (2003). Moving beyond sensitivity and specificity: Using likelihood ratios to help interpret diagnostic tests. Australian Prescriber, 26, 111-113.

Family Practice Notebook. (2018). Pulmonary embolism pre-test probability. Retrieved January 21, 2019, from https://fpnotebook.com/Lung/Exam/PlmnryEmblsmPrtstPrblty.htm

Lalkhen, A.G., & McCluskey, A. (2008). Clinical tests: Sensitivity and specificity. Continuing Education in Anaesthesia, Critical Care & Pain, 8(6), 221-223.

Maxim, L.D., Niebo, R., & Utell, M.J. (2014). Screening tests: A review with examples. Inhalation Toxicology, 26(13), 811-828.

The NNT. (2019). Diagnostics and likelihood ratios, explained. Retrieved January 17, 2019, from http://www.thennt.com/diagnostics-and-likelihood-ratios-explained/

van Stralen, K.J., Stel, V.S., Reitsma, J.B., Dekker, F.W., Zoccali, C., & Jager, K.J. (2009). Diagnostic methods I: Sensitivity, specificity, and other measures of accuracy. Kidney International, 75, 1257-1263.

 
