Privacy, Confidentiality, and De-identification
Core summary
Privacy is a person's right to control access to themselves and their information. Confidentiality is the researcher's obligation to protect data once collected. De-identification removes identifiers so that data cannot readily be traced back to individuals. These protections are critical for maintaining participant trust and complying with regulations.
Detailed explanation
Privacy concerns who has access to a person and their information. Confidentiality concerns how collected data is handled and protected. De-identification is the process of removing or obscuring personal identifiers.
Under HIPAA (US), two methods achieve de-identification:
Safe Harbor: removing 18 specified types of identifiers (names, geographic units smaller than a state, dates more specific than the year, phone numbers, medical record numbers, Social Security numbers, etc.), with no actual knowledge that the remaining information could identify an individual.
Expert Determination: a person with appropriate statistical and scientific expertise determines, and documents, that the risk of re-identification is very small.
Anonymization goes further: the link between data and identity is permanently destroyed, so no key remains that could be used to re-identify participants.
Key data security practices include: storing data on encrypted, password-protected devices; using study ID codes instead of names; keeping the key linking codes to identities in a separate, secured location; limiting access to identified data to essential personnel; using secure data transfer methods; and having a data destruction plan for when the study ends.
A data breach can harm participants (identity theft, stigma, discrimination) and can end a researcher's career.
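The practice of using study ID codes and keeping the linking key separate can be sketched in a few lines of Python. This is a minimal illustration, not a validated tool: the field names (name, mrn), the "S-" code format, and the function name pseudonymize are all assumptions made for the example.

```python
import secrets

# Hypothetical names of the direct-identifier fields in each record.
DIRECT_IDENTIFIERS = ("name", "mrn")


def pseudonymize(records):
    """Split records into coded study data and a separate linking key."""
    coded = []
    linking_key = {}
    for record in records:
        # Use a random code so the study ID reveals nothing about the person.
        study_id = "S-" + secrets.token_hex(4)
        while study_id in linking_key:  # redraw on the rare collision
            study_id = "S-" + secrets.token_hex(4)

        # Identifiers go only into the linking key.
        linking_key[study_id] = {f: record[f] for f in DIRECT_IDENTIFIERS}

        # Everything else goes into the coded dataset, tagged with the study ID.
        coded_row = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        coded_row["study_id"] = study_id
        coded.append(coded_row)
    return coded, linking_key


records = [
    {"name": "Patient A", "mrn": "000001", "hba1c": 7.1},
    {"name": "Patient B", "mrn": "000002", "hba1c": 6.4},
]
coded, linking_key = pseudonymize(records)
```

The two return values are meant to be saved in different places: the coded rows with the analysis files, and the linking key in a separate, secured location with restricted access. Destroying the linking key at the end of the study is what would move the dataset toward anonymization.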
Clinical example
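You extract patient data from electronic health records for a retrospective study. You remove the direct identifiers (names, medical record numbers), shift each patient's dates by a random interval, and truncate zip codes to the first 3 digits. Because Safe Harbor itself allows only the year of a date to be kept, and permits a 3-digit zip code only when that area has more than 20,000 residents, you confirm with your IRB or privacy office that the shifted dates are covered, for example through Expert Determination. You store the coded dataset on an encrypted university drive. The linking key is kept in a separate locked cabinet accessible only to you and your PI.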
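The date shifting and zip code truncation in this example can be sketched as follows. The function names, the 365-day shift window, and the restricted_prefixes parameter are assumptions for illustration; the actual list of low-population 3-digit zip areas, and whether shifted dates are acceptable for your study, should come from your institution's privacy office.

```python
import random
from datetime import date, timedelta


def make_offset(rng, max_days=365):
    """Draw one non-zero shift (in days) to reuse for every date of a patient."""
    offset = 0
    while offset == 0:
        offset = rng.randint(-max_days, max_days)
    return timedelta(days=offset)


def shift_dates(visit_dates, offset):
    """Apply the same offset to all dates so intervals between visits survive."""
    return [d + offset for d in visit_dates]


def truncate_zip(zip_code, restricted_prefixes=frozenset()):
    """Keep the first 3 digits, or '000' if the prefix is on the restricted list."""
    prefix = zip_code[:3]
    return "000" if prefix in restricted_prefixes else prefix


# Example for one patient; SystemRandom draws from the operating system's
# randomness source, so the offset cannot be reproduced from a seed.
rng = random.SystemRandom()
offset = make_offset(rng)
visits = [date(2021, 3, 1), date(2021, 3, 15)]
shifted = shift_dates(visits, offset)
zip3 = truncate_zip("90210")
```

Using one offset per patient, rather than one per date, keeps the time between visits intact for analysis. The offset itself is identifying information and belongs with the linking key, not with the dataset.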
Research example
In 2019, a major academic medical center suffered a data breach exposing research participant information. The institution faced regulatory penalties, lawsuits, and a significant loss of public trust — illustrating why data security is not optional but a core ethical obligation.
Knowledge check
Q1. What is the difference between privacy and confidentiality?
Q2. How many identifiers must be removed under HIPAA Safe Harbor?
Q3. What is the difference between de-identification and anonymization?