Reliability and Reproducibility
Core summary
Reliability is the consistency of a measurement: would you get the same result if you measured again? Reproducibility is whether another researcher, following your methods, can reach the same results and conclusions. Both are essential for trustworthy research.
Detailed explanation
A study can be valid (measuring the right thing) but unreliable (inconsistent measurements), or reliable but not valid (consistently measuring the wrong thing). Ideally, research is both.
Reliability has several dimensions. Test-retest reliability: does the same instrument give the same result when used on the same person at different times? Inter-rater reliability: do different observers get the same result when measuring the same thing? Internal consistency: do different questions on a questionnaire measure the same underlying concept?
Reproducibility is the ability of other researchers, using the same methods and data, to arrive at the same results. The 'reproducibility crisis' in science, in which many published findings cannot be replicated, has highlighted the importance of transparent methods, pre-registration, open data, and detailed protocols.
The crisis is particularly relevant in biomedical research. Estimates suggest that more than half of preclinical research findings may not be reproducible. Contributing factors include small samples, selective reporting, p-hacking (trying different analyses or subsets of the data until a statistically significant result appears), and inadequate methods descriptions.
As a clinician-researcher, you can improve reproducibility by pre-registering your study, publishing your protocol, sharing your data and code when possible, reporting methods in sufficient detail for replication, and following reporting guidelines.
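Two of the reliability dimensions described above are commonly summarized with a single statistic: Cohen's kappa for inter-rater reliability (agreement between two raters, corrected for chance) and Cronbach's alpha for internal consistency. The Python sketch below is a minimal illustration rather than a validated implementation. The function names, the two raters' labels, and the questionnaire scores are all hypothetical and are not taken from any study.

```python
from collections import Counter
from statistics import variance


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same subjects."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of subjects")
    n = len(rater_a)
    # Proportion of subjects on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
    if expected == 1:  # both raters used one identical category throughout
        return 1.0
    return (observed - expected) / (1 - expected)


def cronbach_alpha(rows):
    """Internal consistency; each row is one respondent's item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_vars = [variance(col) for col in zip(*rows)]  # variance of each item
    total_var = variance([sum(r) for r in rows])       # variance of total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: two observers classify the same 10 X-rays.
rater_a = ["abn", "abn", "abn", "abn", "norm", "norm", "norm", "norm", "norm", "norm"]
rater_b = ["abn", "abn", "abn", "norm", "norm", "norm", "norm", "norm", "norm", "abn"]
print(f"kappa = {cohens_kappa(rater_a, rater_b):.2f}")  # kappa = 0.58

# Hypothetical data: 5 respondents answer 3 questionnaire items (1-5 scale).
scores = [[3, 4, 3], [4, 5, 4], [2, 2, 3], [5, 5, 5], [3, 3, 4]]
print(f"alpha = {cronbach_alpha(scores):.2f}")  # alpha = 0.92
```

In this made-up example the raters agree on 8 of 10 films (80%), but because 52% agreement would be expected by chance alone, kappa is only about 0.58. This is why raw percent agreement tends to overstate inter-rater reliability.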
Clinical example
A blood pressure monitor that gives readings of 120, 140, and 110 mmHg on the same patient within minutes has poor reliability. Even if the average is close to the true value, the inconsistency makes individual readings untrustworthy.
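To put a number on the inconsistency in this example, one could summarize the spread of the three readings. The sketch below uses Python's standard statistics module. The variable names are illustrative, and the coefficient of variation is only one of several possible summaries of measurement spread.

```python
from statistics import mean, stdev

readings = [120, 140, 110]  # mmHg, the three readings from the example

avg = mean(readings)
sd = stdev(readings)                     # sample standard deviation
cv = sd / avg * 100                      # coefficient of variation, in percent
spread = max(readings) - min(readings)   # range between highest and lowest

print(f"mean = {avg:.1f} mmHg")    # mean = 123.3 mmHg
print(f"SD = {sd:.1f} mmHg")       # SD = 15.3 mmHg
print(f"CV = {cv:.1f}%")           # CV = 12.4%
print(f"range = {spread} mmHg")    # range = 30 mmHg
```

A 30 mmHg range on the same patient within minutes is large enough that any single reading could be far from the true value, even though the mean looks plausible.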
Research example
In 2015, the Open Science Collaboration attempted to replicate 100 published psychology studies. Only 36% of the replications produced statistically significant results, compared with 97% of the original studies, a finding that shook confidence in the reliability of published research.
Knowledge check
Q1. A measurement tool that gives the same result each time but always overestimates the true value is:
Q2. The reproducibility crisis suggests that most published research findings are fraudulent.
Q3. Inter-rater reliability measures: