CIPT Exam - Question 39

Question

Aadhaar is a unique-identity number of 12 digits issued to all Indian residents based on their biometric and demographic data. The data is collected by the Unique Identification Authority of India. The Aadhaar database contains the Aadhaar number, name, date of birth, gender and address of over 1 billion individuals.

Which of the following datasets derived from that data would be considered the most de-identified?

Examice · Accepted Answer

A count of the century of birth and hash of the last 3 digits of the person's Aadhaar number provides the most de-identified dataset. The century of birth is a very broad category and does not provide specific information that could easily pinpoint an individual. Additionally, hashing the last 3 digits of the Aadhaar number offers a layer of anonymization, ensuring that even part of the unique identifier is not easily traceable to a specific individual. The other options involve more specific date components and personal identifiers, which could increase the risk of re-identification.
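To make the reasoning above concrete, here is a minimal Python sketch of how such a dataset could be derived. It is only an illustration: the helper names `century_of_birth` and `hash_last3`, the choice of SHA-256, and the sample records are all assumptions, not anything specified by the question or by UIDAI.

```python
import hashlib
from collections import Counter


def century_of_birth(year: int) -> int:
    """Generalize a birth year to its century, e.g. 1987 -> 20."""
    return (year - 1) // 100 + 1


def hash_last3(aadhaar: str) -> str:
    """Hash only the last 3 digits (SHA-256 is an assumed choice)."""
    return hashlib.sha256(aadhaar[-3:].encode("ascii")).hexdigest()


# Made-up 12-digit numbers and birth years, for illustration only.
records = [
    ("123412341234", 1987),
    ("999988881234", 1990),
    ("555566661234", 1975),
    ("111122223456", 1995),
    ("777700003456", 1960),
]

# Count how many people fall into each (century, hashed suffix) bucket.
groups = Counter((century_of_birth(y), hash_last3(a)) for a, y in records)

for (century, digest), count in groups.items():
    print(century, digest[:8], count)
```

In this toy sample, five different people collapse into two buckets. One caveat: there are only 1,000 possible 3-digit suffixes, so the hash itself can be reversed by trying every value. The protection comes mostly from how coarse the two fields are (each bucket is shared by many people), not from the hashing step.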

haha2345 · Answer

I think it is A.  
Gender and count of year is the most de-identified.

Stants · Answer

The correct answer is D. A count of the century of birth and hash of the last 3 digits of the person’s Aadhaar number. This option provides the least amount of identifiable information. The century of birth is a very broad category, and hashing the last 3 digits of the Aadhaar number does not reveal much information about the individual. The other options provide more specific information, such as the year, month, or day of birth, and more identifiable aspects of the person’s name or gender, which could potentially be used to re-identify the individual. Therefore, option D is the most de-identified dataset.
