Ontology mapper
First Claim
1. One or more computer-readable storage devices having computer-executable instructions embodied thereon that when executed provide a method for facilitating decision support by determining nomenclature linkages between variables in databases that have different ontologies, the method comprising:
accessing a first document from a first record system having a first ontology;
accessing a second document from a second record system having a second ontology that is different than the first ontology;
determining categorical datatypes of the first document;
determining categorical datatypes of the second document;
based on the categorical datatypes of the first document and the categorical datatypes of the second document, generating a set of textmatrices;
applying latent semantic analysis to the set of textmatrices to determine a latent semantic space associated with at least one first-document variable and at least one second document variable;
specifying a threshold of similarity;
for a first comparison-variable from the at least one first-document variable associated with the latent semantic space;
determining a measure of similarity to a second-comparison variable from the at least one second-document variable associated with the latent semantic space;
performing a comparison of the measure of similarity to the threshold; and
based on the comparison, determining that the measure of similarity satisfies the threshold, associating the first comparison variable with the second comparison variable, and designating the association as a synonymy, wherein the threshold is satisfied if the measure of similarity is greater than the threshold.
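The steps recited in the claim (text matrices, latent semantic analysis, a similarity measure, and a threshold test) can be illustrated with a minimal sketch. The code below is an illustration under stated assumptions, not the patented implementation: it uses a plain term-count matrix, a truncated SVD from NumPy as the latent semantic analysis step, and cosine similarity as the measure of similarity. The function names, the toy documents, the rank k = 2, and the threshold 0.9 are all hypothetical, and the step of determining categorical datatypes is omitted.

```python
import numpy as np

def term_document_matrix(docs):
    """Build a raw term-count matrix (rows: terms, columns: documents)."""
    vocab = sorted({tok for d in docs for tok in d.lower().split()})
    index = {t: i for i, t in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for tok in d.lower().split():
            matrix[index[tok], j] += 1
    return vocab, matrix

def latent_term_vectors(matrix, k):
    """Truncated SVD: return k-dimensional latent coordinates for each term."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def find_synonyms(first_vars, second_vars, vocab, vectors, threshold):
    """Pair variables whose latent-space similarity exceeds the threshold."""
    index = {t: i for i, t in enumerate(vocab)}
    pairs = []
    for a in first_vars:
        for b in second_vars:
            sim = cosine(vectors[index[a]], vectors[index[b]])
            if sim > threshold:  # strictly greater, as recited in the claim
                pairs.append((a, b, sim))
    return pairs

# Hypothetical toy records: system A uses "glu"/"hgb",
# system B uses "glucose"/"hemoglobin" in the same contexts.
docs = [
    "glu serum fasting",         # system A
    "glucose serum fasting",     # system B
    "hgb blood count",           # system A
    "hemoglobin blood count",    # system B
]
vocab, matrix = term_document_matrix(docs)
vectors = latent_term_vectors(matrix, k=2)
pairs = find_synonyms(["glu", "hgb"], ["glucose", "hemoglobin"],
                      vocab, vectors, threshold=0.9)
```

In this toy data the paired terms never appear in the same document, yet they share context words, so their latent-space vectors coincide and the pairs glu/glucose and hgb/hemoglobin pass the threshold while the cross pairs do not.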
Abstract
Systems, methods and computer-readable media are provided for facilitating patient health care by providing discovery, validation, and quality assurance of nomenclatural linkages between pairs of terms or combinations of terms in databases extant on multiple different health information systems that do not share a set of unified codesets, nomenclatures, or ontologies, or that may in part rely upon unstructured free-text narrative content instead of codes or standardized tags. Embodiments discover semantic structures existing naturally in documents and records, including relationships of synonymy and polysemy between terms arising from disparate processes, and maintained by different information systems. In some embodiments, this process is facilitated by applying Latent Semantic Analysis in concert with decision-tree induction and similarity metrics. In some embodiments, data is re-mined and regression testing is applied to new mappings against an existing mapping base, thereby permitting these embodiments to “learn” ontology mappings as clinical, operational, or financial patterns evolve.
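The abstract mentions regression testing of new mappings against an existing mapping base but does not say how the check is performed. One plausible reading, sketched below purely as an assumption, is a consistency comparison: each re-mined candidate mapping is classified as unchanged, new, or conflicting with the existing base, and existing mappings that were not rediscovered are listed separately. The function name, the dictionary representation, and the example terms are hypothetical.

```python
def regression_check(existing, candidate):
    """Compare candidate mappings with an existing mapping base.

    Both arguments map a source term to its target term. This is an
    assumed interpretation of "regression testing", not the source's method.
    """
    report = {"unchanged": [], "new": [], "conflicts": [], "missing": []}
    for source, target in sorted(candidate.items()):
        if source not in existing:
            report["new"].append((source, target))
        elif existing[source] == target:
            report["unchanged"].append((source, target))
        else:
            # Same source term now maps elsewhere: flag for review.
            report["conflicts"].append((source, existing[source], target))
    for source, target in sorted(existing.items()):
        if source not in candidate:
            report["missing"].append((source, target))
    return report

# Hypothetical mapping base and re-mined candidates.
existing = {"glu": "glucose", "hgb": "hemoglobin"}
candidate = {"glu": "glucose", "hgb": "hematocrit", "na": "sodium"}
report = regression_check(existing, candidate)
```

A conflict such as hgb mapping to hematocrit instead of hemoglobin would be held back for review in this sketch, while new, non-conflicting mappings could be added to the base.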
208 Citations
20 Claims
1. Independent claim 1, as set out in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.
12. A system for facilitating decision support by determining nomenclature linkages between variables in databases having different ontologies, comprising:
one or more computer processors; and
one or more computer storage media storing computer-useable instructions that, when executed by the one or more processors, implement a method comprising:
accessing a first document from a first record system having a first ontology;
accessing a second document from a second record system having a second ontology that is different than the first ontology;
determining categorical datatypes of the first document;
determining categorical datatypes of the second document;
based on the categorical datatypes of the first document and the categorical datatypes of the second document, generating a set of textmatrices;
applying latent semantic analysis to the set of textmatrices to determine a latent semantic space associated with at least one first-document variable and at least one second document variable;
specifying a threshold of similarity;
for a first comparison-variable from the at least one first-document variable associated with the latent semantic space;
determining a measure of similarity to a second-comparison variable from at least one second-document variable associated with the latent semantic space;
performing a comparison of the measure of similarity to the threshold; and
based on the comparison, determining that the measure of similarity satisfies the threshold, associating the first comparison variable with the second comparison variable, and designating the association as a synonymy, wherein the threshold is satisfied if the measure of similarity is greater than the threshold.
Dependent claims: 13, 14, 15, 16, 17, 18.
19. A system for discovering and validating latent relationships in data, comprising:
one or more processors; and
one or more computer storage media storing computer-useable instructions that, when executed by the one or more processors, implement a method comprising:
receiving a plurality of documents from two or more record-keeping systems, wherein the received plurality of documents comprises a set of documents;
determining categorical datatypes associated with each of the documents within the set of documents;
based on the set of documents, generating a set of textmatrices;
applying latent semantic analysis to the set of textmatrices to determine a latent semantic space;
specifying a threshold of similarity;
for a first-document variable, from a first document, associated with the latent semantic space;
determining a measure of similarity to a second-document variable, from a second document, associated with the latent semantic space;
performing a comparison of the measure of similarity to the threshold; and
based on the comparison, determining that the measure of similarity satisfies the threshold, associating the first-document variable with the second-document variable, and designating the association as a synonymy, wherein the threshold is satisfied if the measure of similarity is greater than the threshold.
Dependent claims: 20.
Specification