Human Data Types

The table lists human data types with their definitions, research context, and WVU Research Risk/Classification.
Type | Definition | Research Context and Other Information | WVU Research Risk/Classification
Identifiable | Private information for which the participant's identity is or may be established by the researcher or associated with the information.
  • HIPAA
    Information that includes personal identifiers: the 18 HIPAA identifiers, or any subset of health information that identifies the individual or can reasonably be used to identify the individual. This includes information from the WVU Health System EMR or EMRs from an external source.
  • Information collected from participants
High
(Sensitive)
Indirectly Identifiable | HIPAA: Information that can be combined with other information to potentially identify a specific individual.
  • HIPAA designates the following as indirect identifiers: 
    • city
    • state
    • zip codes
    • elements of dates and other numbers
    • characteristics or codes not HIPAA-designated as direct identifiers
High
(Sensitive)
De-Identified | Data from which identifying information has been permanently removed. The data can never be re-identified. Under the HIPAA Privacy Rule, data are de-identified if either of the following applies:
  • Expert Determination
    An experienced expert determines that the risk that certain information could be used to identify an individual is "very small", and documents and justifies the determination.
  • OR
  • Safe Harbor
    The data do not include any of the 18 identifiers (of the individual or his/her relatives, household members, or employers), which could be used alone or in combination with other information to identify the subject. Note that even if these identifiers are removed, the Privacy Rule states that information will be considered identifiable if the covered entity knows that the identity of the person may still be determined. An expert is not involved in the Safe Harbor method.
Low
(Public)
Anonymous | Data collected without identifiers and never linked to an individual. The researcher has NO way to link the data to a participant.
The researcher may know who the participant is but cannot link a response to a participant.
Low
(Public)
Anonymized | Previously identifiable data (indirectly or individually identifiable) that have been de-identified and for which a code or other link no longer exists. A researcher has NO way to link anonymized data back to a specific participant. Anonymized data ARE NOT the same as anonymous, coded, or de-identified data.

Low
(Public)
Coded | Data are separated from personal identifiers through use of a code. As long as a link exists, data are considered indirectly identifiable and not anonymous, anonymized, or de-identified.
Coercion - Persuasion (i.e., of an unwilling person) to do or to agree to something by using obvious or implied force or threats.
WVU considers coded data from the EMR to be sensitive data.
High
(Sensitive)
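As an illustration of the Coded type, the following is a minimal Python sketch of separating identifiers from study data through a code. It is not a WVU procedure: the helper `code_records` and the field names (`name`, `mrn`, `systolic_bp`) are hypothetical and chosen only for the example.

```python
import secrets


def code_records(records, id_fields):
    """Split records into coded data and a separate code-to-identifier link.

    Hypothetical helper for illustration only.
    """
    coded = []
    link = {}
    for record in records:
        # Random study code; it is not derived from any identifier.
        code = secrets.token_hex(8)
        # The link table holds the identifiers and should be stored separately.
        link[code] = {f: record[f] for f in id_fields if f in record}
        # The coded record keeps only non-identifier fields plus the code.
        data = {k: v for k, v in record.items() if k not in id_fields}
        data["code"] = code
        coded.append(data)
    return coded, link


# Hypothetical example record.
records = [{"name": "A. Example", "mrn": "000000", "systolic_bp": 120}]
coded, link = code_records(records, id_fields={"name", "mrn"})
```

While `link` exists, the coded records remain indirectly identifiable, as the definition above states. Only when the link no longer exists do the data match the Anonymized definition.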
Limited Data Set | A HIPAA term for a data set in which some PHI (18 identifiers) remains. All of the following identifiers must be removed for health information to qualify as a limited data set:
  • Names
  • Street addresses
    (other than town, city, state and zip code)
  • Telephone numbers
  • Fax numbers
  • Email addresses
  • Social Security numbers
  • Medical record numbers
  • Health plan beneficiary numbers
  • Account numbers
  • Device identifiers and serial numbers
  • URLs
  • IP address numbers
  • Biometric identifiers
    (including finger and voice prints)
  • Full face photos
    (or comparable images)
High
(Sensitive)
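The removal list for a limited data set can be sketched as a simple field filter. This is a minimal Python example under stated assumptions: the field names in `LDS_EXCLUDED_FIELDS` and the helper `strip_listed_identifiers` are hypothetical stand-ins for the identifiers listed above, and real data sets will use different column names.

```python
# Hypothetical field names standing in for the identifiers listed above.
LDS_EXCLUDED_FIELDS = {
    "name",
    "street_address",
    "telephone",
    "fax",
    "email",
    "ssn",
    "medical_record_number",
    "health_plan_beneficiary_number",
    "account_number",
    "device_identifier",
    "url",
    "ip_address",
    "biometric_identifier",
    "full_face_photo",
}


def strip_listed_identifiers(record):
    """Return a copy of the record without the listed identifier fields."""
    return {k: v for k, v in record.items() if k not in LDS_EXCLUDED_FIELDS}


# Hypothetical example: town, city, state, and zip code may remain.
record = {
    "name": "A. Example",
    "email": "a.example@example.org",
    "city": "Morgantown",
    "zip": "26506",
    "visit_year": 2020,
}
limited = strip_listed_identifiers(record)
```

Filtering by field name does not inspect free-text values, so identifiers embedded in notes or comments would pass through unchanged. The result is still classified as High (Sensitive) in the table above.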