A May 23 headline in USA Today was chilling in its implications. "VA: Data on 26.5M veterans stolen." The article went on to describe how an analyst in the Department of Veterans Affairs had taken home a computer with identifiable data on every veteran who had been discharged since 1975. The computer was stolen and tens of millions of veterans now face a dramatically increased risk of identity theft.
It was the height of irony that I was scheduled later that day to give a presentation outlining a proposed national healthcare identifier system, one that includes techniques for preventing exactly this type of mishap. Nor was it much comfort to know that, in the future, we might be able to ensure that such episodes do not occur.
The loss of control of one's identity information in the form of items such as name, address, birth date and Social Security number is not a readily repairable error. Once someone with malicious intent obtains information such as this, it is almost impossible to ever restore a person's confidence that their information is no longer at risk. Even if the perpetrator is caught and the data is recovered, it is not possible to be entirely sure that a copy (or multiple copies) of the information has not been made.
Needless to say, this episode should not have occurred. According to "the rules," the departmental analyst should not have taken this information out of the office. But this episode would have been just as damaging if the computer had been stolen from his or her office. And, of course, people rarely anticipate that an event such as a theft might occur.
Blinded by the data
The major point to be learned from this episode is that the analyst should not have been dealing with identifiable data in the first place. There are very few situations where analysis of a large database of person-related data requires knowledge of the specific individuals in that database. Usually such analysis looks at trends, aggregate statistics, correlations between various population segments, and the like.
In general, these types of analysis can be performed without knowing the identities of the persons involved. Information extracted for use in such a database should be blinded. All basic identifying information should be stripped and, instead, each patient should be represented by a unique identification number that can be used to aggregate the needed data items for that record without revealing whom those data items actually describe.
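The blinding step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any real VA system: the field names, the sample records, the `BLINDING_KEY` secret and the `pseudo_id` and `blind` helpers are all hypothetical. A keyed hash (HMAC) is one common way to derive a stable identification number; the key would stay with the data custodian and never travel with the extract.

```python
import hashlib
import hmac

# Hypothetical secret held only by the data custodian, never shipped with the extract.
BLINDING_KEY = b"replace-with-a-randomly-generated-secret"

# Fields assumed to identify a person directly; the real list depends on the dataset.
IDENTIFYING_FIELDS = {"name", "address", "birth_date", "ssn"}


def pseudo_id(ssn: str, key: bytes = BLINDING_KEY) -> str:
    """Derive a stable study identifier from an SSN using a keyed hash."""
    return hmac.new(key, ssn.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def blind(record: dict, key: bytes = BLINDING_KEY) -> dict:
    """Return a copy of the record with identifying fields replaced by a pseudo-ID."""
    blinded = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    blinded["pid"] = pseudo_id(record["ssn"], key)
    return blinded


# Made-up sample data: two rows for one person, one row for another.
records = [
    {"name": "A. Example", "address": "1 Main St", "birth_date": "1950-01-01",
     "ssn": "000-00-0001", "discharge_year": 1976, "diagnosis": "J45"},
    {"name": "A. Example", "address": "1 Main St", "birth_date": "1950-01-01",
     "ssn": "000-00-0001", "discharge_year": 1976, "diagnosis": "E11"},
    {"name": "B. Sample", "address": "2 Oak Ave", "birth_date": "1962-05-09",
     "ssn": "000-00-0002", "discharge_year": 1985, "diagnosis": "I10"},
]

blinded_records = [blind(r) for r in records]
```

The two rows for the same person receive the same `pid`, so an analyst can still aggregate data per patient, while the name, address, birth date and Social Security number never reach the analyst's machine.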
If such a blinded database is accidentally lost, there still is some risk that an unscrupulous user might try to identify individuals through inference, but this is an arduous and time-consuming task that is much less likely to succeed than gaining direct access to such information stored in plain text.
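One way to gauge that residual inference risk is to count how many blinded rows share the same combination of seemingly harmless attributes. The sketch below is only an illustration: the choice of `zip3`, `birth_year` and `sex` as quasi-identifiers, the threshold `k` and the `small_groups` helper are assumptions made for the example, not part of any proposal discussed here.

```python
from collections import Counter

# Hypothetical quasi-identifiers: harmless alone, potentially identifying in combination.
QUASI_IDENTIFIERS = ("zip3", "birth_year", "sex")


def small_groups(rows, k=2, fields=QUASI_IDENTIFIERS):
    """Return the quasi-identifier combinations shared by fewer than k rows."""
    counts = Counter(tuple(row[f] for f in fields) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}


# Made-up blinded rows; "pid" stands in for the unique identification number.
blinded_rows = [
    {"pid": "a1", "zip3": "200", "birth_year": 1950, "sex": "M"},
    {"pid": "b2", "zip3": "200", "birth_year": 1950, "sex": "M"},
    {"pid": "c3", "zip3": "995", "birth_year": 1931, "sex": "F"},
]

# Combinations that occur only once are the easiest targets for inference.
risky = small_groups(blinded_rows, k=2)
```

Here the third row is the only one with its combination of values, so it would be flagged. A custodian might coarsen or suppress such rows before releasing an extract.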
As an additional layer of protection, the data could be encrypted so that it can be reconstructed only with a secret decryption key. Together, these two techniques, blinding and encryption, can ensure that episodes such as the one described above are not repeated.
Companies need to establish a policy of using blinded and encrypted information wherever the individual identities associated with the information are not necessary for successful processing.
For example, when a hospital attempts to submit a bill for a patient stay, the person preparing and validating the bill before it is submitted does not need to know the identity of the person whose bill they are processing. They need to know that they have complete and accurate information on the episode being billed, but the identity of the person involved is irrelevant. In this case, the data should be linked to a pseudo-identifier in order to prepare the bill.
Preparation is key
It is not possible to foresee all the various ways in which errors occur and mistakes are made. However, it is possible to protect the identity of the people whose information is being processed so that even avoidable mistakes do not lead to severe harm.
It may not have been possible to anticipate that an analyst would take a huge database of personally identified information home to perform additional work, or that the database would subsequently be stolen. It is possible, however, to anticipate that performing analysis on blinded and encrypted data will help avoid harm across all sorts of situations, foreseen or not.
Humans are fallible and, from time to time, each of us does things that in retrospect can only be classified as dumb. We have to be smart enough to design our systems so that mere mortals can use them without causing irreparable harm through their actions. The tools are available; we just need the discipline to use them properly.
Barry Hieb, M.D., is research director with Gartner Healthcare and a member of ASTM E31.
ASTM on the Case
The ASTM E31.28 healthcare standards subcommittee has developed a proposed national healthcare identification system that could readily have been used to avoid incidents such as the VA episode described here.
The proposed system is called a Voluntary National Healthcare Identification (VNHID) System. As its name implies, one of the features that distinguishes this system from previous proposals is that it would be voluntary. Individual patients would choose to join the system and be issued one or more identifiers because they have decided that the benefits of the system make it reasonable for them to participate. Those who choose not to participate would be free to refrain.
The VNHID would be implemented as a secure Web site whose primary function is to issue two types of identifiers: open UHIDs (Universal Healthcare IDentifiers) and private EUHIDs (Encrypted Universal Healthcare IDentifiers). These identifiers would be issued to any person who requests one through a physician participating in a health system linked to the VNHID.
EUHIDs represent the key to avoiding disasters such as the recent unintentional disclosure of veterans' personal identification information. An EUHID is an identifier that can be used to link together disparate pieces of patient information without knowing the identity of the person involved. It is a mechanism to support the blinding function described in the article. If the information in the stolen system had been linked using EUHIDs, then even though the stolen information would be at risk, it would be difficult, if not impossible, to link any of that information back to the individual involved. In addition, none of the identifying information of any patient (name, birth date, Social Security number, etc.) would have been revealed.
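The linking idea can be illustrated with a short Python sketch. It does not reflect the actual ASTM design, whose details are not given here: the `IdentifierIssuer` class, the `link_by_identifier` helper, the record fields and the use of random tokens are all assumptions chosen for illustration. The only point it makes is that records from different sources can be joined on an opaque identifier while the mapping back to a person stays with the issuer.

```python
import secrets


class IdentifierIssuer:
    """Hypothetical stand-in for the issuing service; not the ASTM design."""

    def __init__(self):
        # Opaque identifier -> person key. This mapping stays with the issuer.
        self._owner_of = {}

    def issue(self, person_key: str) -> str:
        token = secrets.token_hex(8)  # random, so it carries no personal data
        self._owner_of[token] = person_key
        return token


def link_by_identifier(*sources):
    """Group records from several sources under their shared opaque identifier."""
    linked = {}
    for source in sources:
        for record in source:
            linked.setdefault(record["euhid"], []).append(record)
    return linked


issuer = IdentifierIssuer()
euhid = issuer.issue("person-001")

# Made-up records held by two different systems, both keyed by the same identifier.
lab_results = [{"euhid": euhid, "test": "HbA1c", "value": 6.1}]
pharmacy = [{"euhid": euhid, "drug": "metformin"}]

linked = link_by_identifier(lab_results, pharmacy)
```

Someone who obtained `lab_results` and `pharmacy` could see that the two records belong together, but without the issuer's private mapping would have no name, birth date or Social Security number to attach to them.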
For more information on the ASTM proposal, see "The Case for a Voluntary National Healthcare Identifier," B. Hieb, Journal of ASTM International, Feb. 2006, Vol. 3, No. 2.