Automated mapping of service codes in healthcare systems
First Claim
1. A method for automatic mapping of semantics in healthcare, the method comprising:
accessing first transaction data of a first healthcare entity in a first database, the first transaction data having a first set of values for a plurality of fields corresponding to a first semantic system;
calculating, by a processor, a first distribution of the first set of values in the first transaction data, wherein the first distribution corresponds to the first semantic system;
accessing second transaction data of a second healthcare entity in a second database, the second transaction data having a second set of values for a plurality of fields corresponding to a second semantic system that is different than the first semantic system;
calculating, by the processor, a second distribution of the second set of values in the second transaction data, wherein the second distribution corresponds to the second semantic system;
comparing, by the processor, the statistical similarity of the first distribution corresponding to the first semantic system and the second distribution corresponding to the second semantic system that is different than the first semantic system with machine learning, wherein comparing includes:
determining a probability that a first field in the plurality of fields corresponding to the first semantic system is a particular field type based on a number of distinct values for the first field,
determining a probability that a second field in the plurality of fields corresponding to the second semantic system is the particular field type based on a number of distinct values for the second field, and
when it is determined that the number of distinct values of the first field of the first semantic system and the number of distinct values of the second field of the second semantic system are not within a predetermined threshold of one another, determining the first field and the second field are not the same particular field type;
automatically outputting, from the machine learning, a map relating syntax of the first transaction data of the first semantic system to syntax of the second transaction data of the second semantic system, the map being a function of the comparing the statistical similarity of the first distribution corresponding to the first semantic system and the second distribution corresponding to the second semantic system;
using the map for semantic interoperability between the first and second semantic systems, communicating information between the first and second healthcare entities; and
updating the map each time new transaction data of the first healthcare entity is accessed, wherein updating the map includes:
re-calculating the first distribution using the new transaction data; and
comparing the statistical similarity of the first distribution, as updated, to the second distribution.
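The comparison steps recited in the claim can be illustrated with a minimal sketch. It is not the patented method and uses no machine learning: it only shows a distinct-value-count gate followed by a label-free comparison of value distributions. The function names, the default threshold of 1, the choice of total variation distance, and the sample records are all assumptions made for illustration.

```python
from collections import Counter


def value_distribution(values):
    """Relative frequency of each distinct value in one field."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def distinct_counts_compatible(values_a, values_b, threshold=1):
    """Gate in the spirit of the claim: fields whose distinct-value counts
    differ by more than `threshold` (an assumed value) are treated as not
    being the same field type."""
    return abs(len(set(values_a)) - len(set(values_b))) <= threshold


def profile_distance(dist_a, dist_b):
    """Total variation distance between label-free frequency profiles.

    Frequencies are sorted in descending order so that 'M'/'F' and 1/2 can
    be compared even though the labels differ. 0.0 means identical profiles.
    """
    a = sorted(dist_a.values(), reverse=True)
    b = sorted(dist_b.values(), reverse=True)
    size = max(len(a), len(b))
    a += [0.0] * (size - len(a))
    b += [0.0] * (size - len(b))
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


# Made-up sample fields from two hypothetical semantic systems.
gender = ["M", "F", "F", "M", "F"]
sex = [1, 2, 2, 1, 2]
zip_code = ["08540", "10001", "94105", "60601", "73301"]

same_type_possible = distinct_counts_compatible(gender, sex)
distance = profile_distance(value_distribution(gender), value_distribution(sex))
zip_possible = distinct_counts_compatible(gender, zip_code)
```

Here `gender` and `sex` pass the gate and have identical frequency profiles, while `gender` and `zip_code` are rejected by the distinct-count gate before any distributions are compared.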
Abstract
Automatic mapping of semantics in healthcare is provided. Data sets have different semantics (e.g., Gender designated with M and F in one system and Sex designated with 1 or 2 in another system). For semantic interoperability, the semantic links between the semantic systems of different healthcare entities are created (e.g., Gender=Sex and/or 1=F and 2=M) by a processor from statistics of the data itself. The distribution of variables, values, or variables and values, with or without other information and/or logic, is used to create a map from one semantic system to another. Similar distributions of other variables and/or values are likely to be for variables and/or values with the same meaning.
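As a rough illustration of the Gender/Sex example, the sketch below pairs the values of two fields by the rank of their relative frequency. This is one simple way to derive value links from the data alone, not necessarily the approach the specification uses; the function name and the sample records are hypothetical.

```python
from collections import Counter


def map_values_by_frequency_rank(values_a, values_b):
    """Pair values of two fields by the rank of their relative frequency."""
    ranked_a = [value for value, _ in Counter(values_a).most_common()]
    ranked_b = [value for value, _ in Counter(values_b).most_common()]
    if len(ranked_a) != len(ranked_b):
        raise ValueError("fields have different numbers of distinct values")
    return dict(zip(ranked_a, ranked_b))


# Made-up records: 60% of patients are F in one system and 1 in the other.
gender = ["F"] * 6 + ["M"] * 4
sex = [1] * 6 + [2] * 4

value_map = map_values_by_frequency_rank(gender, sex)
print(value_map)  # {'F': 1, 'M': 2}
```

Rank pairing becomes unreliable when two values have nearly equal frequencies, which is one reason additional information or logic may be needed alongside the distributions.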
23 Claims
1. A method for automatic mapping of semantics in healthcare, the method comprising:
accessing first transaction data of a first healthcare entity in a first database, the first transaction data having a first set of values for a plurality of fields corresponding to a first semantic system;
calculating, by a processor, a first distribution of the first set of values in the first transaction data, wherein the first distribution corresponds to the first semantic system;
accessing second transaction data of a second healthcare entity in a second database, the second transaction data having a second set of values for a plurality of fields corresponding to a second semantic system that is different than the first semantic system;
calculating, by the processor, a second distribution of the second set of values in the second transaction data, wherein the second distribution corresponds to the second semantic system;
comparing, by the processor, the statistical similarity of the first distribution corresponding to the first semantic system and the second distribution corresponding to the second semantic system that is different than the first semantic system with machine learning, wherein comparing includes:
determining a probability that a first field in the plurality of fields corresponding to the first semantic system is a particular field type based on a number of distinct values for the first field,
determining a probability that a second field in the plurality of fields corresponding to the second semantic system is the particular field type based on a number of distinct values for the second field, and
when it is determined that the number of distinct values of the first field of the first semantic system and the number of distinct values of the second field of the second semantic system are not within a predetermined threshold of one another, determining the first field and the second field are not the same particular field type;
automatically outputting, from the machine learning, a map relating syntax of the first transaction data of the first semantic system to syntax of the second transaction data of the second semantic system, the map being a function of the comparing the statistical similarity of the first distribution corresponding to the first semantic system and the second distribution corresponding to the second semantic system;
using the map for semantic interoperability between the first and second semantic systems, communicating information between the first and second healthcare entities; and
updating the map each time new transaction data of the first healthcare entity is accessed, wherein updating the map includes:
re-calculating the first distribution using the new transaction data; and
comparing the statistical similarity of the first distribution, as updated, to the second distribution.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. In a non-transitory computer readable storage medium having stored therein data representing instructions executable by a programmed processor for automatic mapping of service codes in healthcare, the storage medium comprising instructions for:
obtaining service codes from transaction data of disparate healthcare entities, each of the disparate healthcare entities utilizing different semantic systems, wherein the service codes are associated with a different semantic system for each of the disparate healthcare entities;
calculating one or more distributions for the service codes of each of the different semantic systems, wherein a first plurality of values that correspond to service codes in a first semantic system are used to calculate distributions specific to a first semantic system, and wherein a second plurality of values that correspond to service codes in a second semantic system are used to calculate distributions specific to the second semantic system;
determining, for each of the service codes in each of the different semantic systems:
a probability that a first service code is a particular type based on a count of distinct values present in the first plurality of values of the first semantic system that are used for the first service code in the transaction data,
a probability that a second service code is a particular type based on a count of distinct values present in the second plurality of values of the second semantic system that are used for the second service code in the transaction data, and
when it is determined that the count of distinct values of the first service code of the first semantic system and the count of distinct values of the second service code of the second semantic system are not within a predetermined threshold of one another, determining the first service code and the second service code are not the same particular type;
creating, by the programmed processor using the one or more semantic-system-specific distributions calculated for the service codes of each of the different semantic systems and the determinations for each of the service codes in each of the different semantic systems, a map relating the plurality of service codes across the different semantic systems, the map created as a function of statistical information that indicates a level of similarity between at least two of the plurality of service codes across the different semantic systems, wherein the map further relates syntax of the first plurality of values that correspond to service codes in the first semantic system to the syntax of the second plurality of values that correspond to service codes in the second semantic system;
using the map for semantic interoperability between the different semantic systems, communicating information between the disparate healthcare entities using the map for direct semantic interoperability; and
updating the map when new transaction data of the disparate healthcare entities is accessed, wherein updating the map includes:
re-calculating the one or more distributions for the service codes, and
updating the determinations for each of the service codes in each of the different semantic systems.
Dependent claims: 14, 15, 16, 17
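The create-then-update cycle recited in claim 13 can be sketched as a small class that keeps running counts per entity and rebuilds the map on demand. This is a hedged illustration only: the class name, the entity keys `"a"` and `"b"`, the nearest-frequency matching rule, the similarity score, and the service codes are all hypothetical and are not taken from the patent.

```python
from collections import Counter


class ServiceCodeMapper:
    """Keeps per-entity code counts and derives a code map from them."""

    def __init__(self):
        # One running counter per healthcare entity (hypothetical keys).
        self.counts = {"a": Counter(), "b": Counter()}

    def add_transactions(self, entity, codes):
        """Fold newly accessed transaction data into the running counts."""
        self.counts[entity].update(codes)

    def _distribution(self, entity):
        counts = self.counts[entity]
        total = sum(counts.values())
        return {code: n / total for code, n in counts.items()}

    def build_map(self):
        """Link each code of entity 'a' to the code of entity 'b' whose
        relative frequency is closest; similarity is 1 - |difference|."""
        dist_a = self._distribution("a")
        dist_b = self._distribution("b")
        if not dist_a or not dist_b:
            return {}
        mapping = {}
        for code_a, p_a in dist_a.items():
            code_b, p_b = min(dist_b.items(), key=lambda kv: abs(kv[1] - p_a))
            mapping[code_a] = (code_b, 1.0 - abs(p_a - p_b))
        return mapping


mapper = ServiceCodeMapper()
mapper.add_transactions("a", ["XR"] * 7 + ["MRI"] * 3)     # made-up codes
mapper.add_transactions("b", ["7101"] * 7 + ["7205"] * 3)  # made-up codes
initial_map = mapper.build_map()

# New transaction data arrives: the distributions are recalculated and the
# map is rebuilt from the updated counts.
mapper.add_transactions("a", ["MRI"] * 8)
updated_map = mapper.build_map()
```

Because the map is derived entirely from the counts, calling `build_map()` after each `add_transactions()` call re-calculates the distributions and refreshes the links in one step. Nearest-frequency matching is deliberately naive; a fuller implementation would also apply the distinct-count determinations before linking codes.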
18. A system for automatic mapping of service codes in healthcare, the system comprising:
a memory configured to store first medical data in a first representation and store second medical data in a separate second representation, wherein the first medical data corresponds to a first semantic system and the second medical data corresponds to a second semantic system different than the first semantic system; and
a processor configured to:
determine a level of similarity between values, variables, and values and variables, for each variable in the first medical data and each variable in the second medical data;
(1) determine a probability that a first variable of the first medical data corresponding to the first semantic system is a particular field type based on a count of distinct values used for the first variable in the first medical data;
(2) determine a probability that a second variable of the second medical data corresponding to the second semantic system is the particular type based on a count of distinct values used for the second variable in the second medical data;
(3) determine whether the count of distinct values used for the first variable in the first medical data is within a predetermined threshold of the count of distinct values used for the second variable in the second medical data;
(4) when the counts are within the predetermined threshold, increase a probability that the first variable and the second variable are the same particular type; and
(5) when the counts are not within the predetermined threshold, increase a probability that the first variable and the second variable are not the same particular type;
determine, using the level of similarity between the values, variables, or values and variables, corresponding semantic meanings between the first representation of the first medical data of the first semantic system and second representation of the second medical data of the second semantic system;
based on the level of similarity determined and the semantic meanings determined, build a map that relates syntax of the first medical data of the first semantic system to syntax of the second medical data of the second semantic system, the map comprising semantic links between values, variables, or values and variables of the first medical data that corresponds to the first semantic system with values, variables, or values and variables of the second medical data that corresponds to the second semantic system;
communicate information between the first and second semantic systems for direct semantic interoperability; and
when one or more of the first or second medical data are updated, re-determine the level of similarity between linked values, linked variables, and linked values and variables in view of the one or more of the first or second medical data as updated.
Dependent claims: 19, 20, 21, 22, 23
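Steps (1) through (5) of claim 18 can be sketched as two small functions. The claim does not say how the probabilities are computed or by how much they are adjusted, so the inverse-distance heuristic, the additive `step`, the clamping to [0, 1], and all names below are assumptions made purely for illustration.

```python
def field_type_probability(distinct_count, expected_count):
    """Steps (1)-(2), as an assumed heuristic: the closer a variable's
    distinct-value count is to the count expected for a field type
    (e.g. 2 for a gender-like field), the higher the probability."""
    return 1.0 / (1.0 + abs(distinct_count - expected_count))


def update_same_type_probability(prior, count_a, count_b, threshold, step=0.2):
    """Steps (3)-(5): raise the probability that two variables are the same
    type when their distinct-value counts are within `threshold` of each
    other, and lower it otherwise. `step` is an assumed adjustment size."""
    within = abs(count_a - count_b) <= threshold
    updated = prior + step if within else prior - step
    return min(1.0, max(0.0, updated))


# Made-up counts: a 2-valued variable compared against a 2-valued variable
# and against a 500-valued variable.
p_gender_like = field_type_probability(distinct_count=2, expected_count=2)
p_match = update_same_type_probability(0.5, 2, 2, threshold=1)
p_mismatch = update_same_type_probability(0.5, 2, 500, threshold=1)
```

Lowering the same-type probability is used here as the equivalent of step (5), which is phrased in the claim as increasing the probability that the variables are not the same type.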
Specification