SYSTEM FOR ANONYMIZING AND AGGREGATING PROTECTED HEALTH INFORMATION
First Claim
1. A system for anonymizing and aggregating protected health information (PHI) from a plurality of data sources, the system comprising:
a plurality of data hashing appliances each operatively coupled to a respective data source, each hashing appliance configured to receive from the respective data source, one or more patient medical records, each patient medical record containing at least one data element corresponding to confidential protected health information (PHI), and a master record number (MRN) assigned by the respective data source;
each data hashing appliance configured to:
append a salt value to each data element corresponding to confidential PHI in the patient medical record;
generate a hash value for each data element corresponding to salted confidential PHI;
replace the data element corresponding to confidential PHI with the corresponding generated hash value to generate an anonymized patient medical record;
a master patient index server coupled to a data repository, configured to aggregate a plurality of anonymized patient medical records received from the plurality of data hashing appliances under a unique patient identifier;
a vector and cluster matching engine operatively coupled to the master patient index server and the data repository, and configured to determine if the anonymized patient medical record received from respective hashing appliances matches the unique patient identifier corresponding to at least a second anonymized patient medical record stored in the data repository, the matching determined by:
generating a comparison vector by comparing the hash values corresponding to the confidential PHI in the received anonymized patient medical record with corresponding hash values in the second anonymized patient medical record;
generating a confidence vector by assigning weights based on predetermined match conditions;
crossing the comparison vector with the confidence vector to obtain a match confidence level;
comparing the match confidence level to a predetermined threshold; and
mapping the received anonymized patient medical record to the unique patient identifier if the confidence level is greater than the predetermined threshold.
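The three hashing-appliance steps recited above (append a salt, hash, replace) can be illustrated with a minimal Python sketch. The claim does not name a hash function, a normalization rule, or which fields count as PHI, so SHA-256, the lowercase/strip normalization, the field names in `PHI_FIELDS`, and the salt value are all assumptions made for illustration. The sketch also assumes that every appliance uses the same salt, since otherwise hashes from different data sources could not be compared downstream.

```python
import hashlib

# Hypothetical field names; the claim does not enumerate the PHI elements.
PHI_FIELDS = ("first_name", "last_name", "date_of_birth", "ssn")


def hash_element(value: str, salt: str) -> str:
    """Append the salt to a normalized data element and return its SHA-256 hex digest."""
    normalized = value.strip().lower()  # assumed normalization, not from the claim
    return hashlib.sha256((normalized + salt).encode("utf-8")).hexdigest()


def anonymize_record(record: dict, salt: str) -> dict:
    """Replace each PHI element with its salted hash; other fields (e.g. the MRN) pass through."""
    anonymized = dict(record)
    for field in PHI_FIELDS:
        if record.get(field) is not None:
            anonymized[field] = hash_element(str(record[field]), salt)
    return anonymized


record = {
    "mrn": "A-1001",  # master record number assigned by the data source
    "first_name": "Ada",
    "last_name": "Lovelace",
    "date_of_birth": "1815-12-10",
    "ssn": "123-45-6789",
}
anonymized = anonymize_record(record, salt="example-salt")
```

In this sketch the MRN is left unhashed because the claim lists it separately from the PHI data elements; whether a real deployment would also protect it is not stated in the claim.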
Abstract
A patient anonymizing system includes a plurality of hashing appliances and data sources. Each appliance receives medical records containing at least confidential protected health information (PHI). A salt value is appended to each confidential PHI, and a hash is generated, which replaces the confidential PHI to generate an anonymized record. A master patient index server aggregates the anonymized records. A vector and cluster matching engine determines if the anonymized record matches a unique patient identifier corresponding to a second anonymized record. A comparison vector is generated by comparing hash values of the confidential PHI with hash values in the second anonymized record, and is crossed with a confidence vector having weights based on match conditions. This produces a match confidence level, which is compared to a threshold. If the threshold is met, the anonymized record is mapped to the unique patient identifier associated with the second record.
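The matching step can be sketched in a few lines of Python. The source does not define "crossing", so this sketch assumes it means a dot product of a binary comparison vector with a weight vector. The field names, integer weights, and threshold are hypothetical values chosen only to make the arithmetic easy to follow.

```python
# Hypothetical fields, weights, and threshold; none of these values come from the patent.
FIELDS = ("first_name", "last_name", "date_of_birth", "ssn")
WEIGHTS = {"first_name": 15, "last_name": 25, "date_of_birth": 25, "ssn": 35}
THRESHOLD = 60


def comparison_vector(received: dict, stored: dict) -> list:
    """1 where the two records carry the same hash for a field, else 0."""
    return [
        1 if received.get(f) is not None and received.get(f) == stored.get(f) else 0
        for f in FIELDS
    ]


def confidence_vector() -> list:
    """Weights assigned per field, in the same order as the comparison vector."""
    return [WEIGHTS[f] for f in FIELDS]


def match_confidence(comparison: list, confidence: list) -> int:
    """'Cross' the two vectors, interpreted here as a dot product."""
    return sum(c * w for c, w in zip(comparison, confidence))


# Placeholder strings stand in for real hash values.
received = {"first_name": "h-a", "last_name": "h-b", "date_of_birth": "h-c", "ssn": "h-d"}
stored = {"first_name": "h-a", "last_name": "h-b", "date_of_birth": "h-c", "ssn": "h-x"}

comparison = comparison_vector(received, stored)       # [1, 1, 1, 0]
confidence_level = match_confidence(comparison, confidence_vector())  # 15 + 25 + 25 = 65
is_match = confidence_level > THRESHOLD                # True, since 65 > 60
```

Because salted hashes either match exactly or not at all, each comparison entry is binary in this sketch; any partial-match scoring would need additional assumptions that the abstract does not supply.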
20 Claims
14. A method for anonymizing and aggregating protected health information (PHI) from multiple data sources, the method comprising:
transmitting, by a data source, a plurality of patient medical records, to a hashing appliance operatively coupled to the data source, each patient medical record containing at least one data element corresponding to confidential protected health information (PHI);
the hashing appliance:
appending a salt value to each data element corresponding to confidential PHI in the patient medical record;
generating a hash value for each data element corresponding to salted confidential PHI;
replacing the data element corresponding to confidential PHI with the corresponding generated hash value to generate an anonymized patient medical record;
transmitting, by the hashing appliance, a plurality of anonymized patient medical records, to a data repository;
aggregating, by the data repository, the plurality of anonymized patient medical records under a unique patient identifier;
determining, using a vector and cluster matching engine operatively coupled to the data repository, if the anonymized patient medical record received from the hashing appliance matches the unique patient identifier corresponding to at least a second anonymized patient medical record stored in the data repository, the matching determined by:
generating a comparison vector by comparing the hash values corresponding to the confidential PHI in the received anonymized patient medical record with the hash values in the second anonymized patient medical record;
generating a confidence vector by assigning weights based on predetermined match conditions;
crossing the comparison vector with the confidence vector to obtain a match confidence level;
comparing the match confidence level to a predetermined threshold; and
mapping the received anonymized patient medical record to the unique patient identifier if the confidence level is greater than the predetermined threshold.
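The aggregation and mapping steps of the method can be sketched as a small in-memory repository. This is an illustration under stated assumptions, not the claimed implementation: the class name `DataRepository`, the identifier format, the fields, the weights, and the threshold are hypothetical, and minting a new unique patient identifier when no stored record exceeds the threshold is an assumption that the claim does not address.

```python
import itertools

# Hypothetical fields, weights, and threshold chosen for illustration.
FIELDS = ("first_name", "last_name", "date_of_birth")
CONFIDENCE = (25, 35, 40)
THRESHOLD = 50


class DataRepository:
    """Aggregates anonymized records under unique patient identifiers."""

    def __init__(self):
        self.records_by_patient = {}  # unique patient identifier -> list of records
        self._ids = itertools.count(1)

    @staticmethod
    def _score(received: dict, stored: dict) -> int:
        # Binary comparison vector, then a dot product with the confidence vector.
        comparison = [
            int(received.get(f) is not None and received.get(f) == stored.get(f))
            for f in FIELDS
        ]
        return sum(c * w for c, w in zip(comparison, CONFIDENCE))

    def ingest(self, received: dict) -> str:
        """Map the record to the best identifier whose score exceeds THRESHOLD."""
        best_id, best_score = None, THRESHOLD
        for patient_id, records in self.records_by_patient.items():
            for stored in records:
                score = self._score(received, stored)
                if score > best_score:  # strictly greater than the threshold
                    best_id, best_score = patient_id, score
        if best_id is None:
            # Assumption: no match means a new unique patient identifier is created.
            best_id = f"P{next(self._ids)}"
            self.records_by_patient[best_id] = []
        self.records_by_patient[best_id].append(received)
        return best_id


repo = DataRepository()
# Placeholder strings stand in for salted hash values from two data sources.
first = repo.ingest({"mrn": "S1-7", "first_name": "h1", "last_name": "h2", "date_of_birth": "h3"})
second = repo.ingest({"mrn": "S2-9", "first_name": "hX", "last_name": "h2", "date_of_birth": "h3"})
```

Here the second record scores 35 + 40 = 75 against the first, which exceeds the threshold of 50, so both records end up under the same identifier despite carrying different MRNs.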
20. A method for anonymizing and aggregating protected health information (PHI) from multiple data sources, the method comprising:
transmitting a plurality of patient medical records, to a hashing appliance, each patient medical record containing at least one data element corresponding to confidential protected health information (PHI);
the hashing appliance configured to:
append a salt value to each data element corresponding to confidential PHI in the patient medical record;
generate a hash value for each data element corresponding to salted confidential PHI;
replace the data element corresponding to confidential PHI with the corresponding generated hash value to generate an anonymized patient medical record;
aggregating the plurality of anonymized patient medical records under a unique patient identifier;
determining if the anonymized patient medical record matches the unique patient identifier corresponding to at least a second anonymized patient medical record, the matching determined by:
generating a comparison vector by comparing the hash values corresponding to the confidential PHI in the anonymized patient medical record with the hash values in a cluster of anonymized patient medical records;
generating a confidence vector by assigning weights based on predetermined match conditions;
crossing the comparison vector with the confidence vector to obtain a match confidence level;
comparing the match confidence level to a predetermined threshold; and
mapping the received anonymized patient medical record to the cluster if the confidence level is greater than the predetermined threshold.
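Claim 20 compares a record against a cluster of records rather than a single stored record. The claim does not say how a per-field comparison against several records is reduced to one vector entry, so the sketch below assumes an entry is 1 when any member of the cluster carries the same hash for that field. The fields, weights, threshold, and cluster keys are hypothetical.

```python
# Hypothetical fields, weights, and threshold chosen for illustration.
FIELDS = ("first_name", "last_name", "date_of_birth")
CONFIDENCE = (25, 35, 40)
THRESHOLD = 50


def cluster_comparison_vector(record: dict, cluster: list) -> list:
    """1 where the record's hash for a field equals that field's hash in any cluster member."""
    return [
        int(
            record.get(f) is not None
            and any(record[f] == member.get(f) for member in cluster)
        )
        for f in FIELDS
    ]


def best_cluster(record: dict, clusters: dict):
    """Return the key of the highest-scoring cluster above THRESHOLD, or None."""
    best_key, best_score = None, THRESHOLD
    for key, members in clusters.items():
        vector = cluster_comparison_vector(record, members)
        score = sum(c * w for c, w in zip(vector, CONFIDENCE))
        if score > best_score:
            best_key, best_score = key, score
    return best_key


# Placeholder strings stand in for salted hash values.
clusters = {
    "P1": [
        {"first_name": "h1", "last_name": "h2", "date_of_birth": "h3"},
        {"first_name": "h1b", "last_name": "h2", "date_of_birth": "h3"},
    ],
    "P2": [{"first_name": "h9", "last_name": "h8", "date_of_birth": "h7"}],
}
incoming = {"first_name": "h1b", "last_name": "h2", "date_of_birth": "hX"}
target = best_cluster(incoming, clusters)
```

With these values the incoming record matches cluster `P1` on first name (via the second member) and last name, giving 25 + 35 = 60, which exceeds the threshold of 50. One consequence of the any-member rule is that a cluster becomes easier to match as it accumulates variant hashes, which may or may not be desirable.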
Specification