Process and system for high precision coding of free text documents against a standard lexicon
Abstract
Coding free text documents, especially in medicine, has become an urgent priority as electronic medical records (EMR) mature and the need to exchange data between EMRs becomes more acute. However, only a few automated coding systems exist, and they can only code a small portion of the free text against a limited number of codes. The precision of these systems is low and code quality is not measured. The present invention discloses a process and system which implements semantic coding against standard lexicon(s) with high precision. The standard lexicon can come from a number of different sources, but is usually developed by a standards body. The system is semi-automated to enable medical coders or others to process free text documents at a rapid rate and with high precision. The system performs the steps of segmenting a document, flagging the need for corrections, validating the document against a data type definition, and looking up both the semantics and the standard codes which correspond to the document's sentences. The coder has the option to intervene at any step in the process to fix mistakes made by the system. A knowledge base, consisting of propositions, represents the semantic knowledge in the domain. When sentences with unknown semantics are discovered, they can be easily added to the knowledge base. The propositions in the knowledge base are associated with codes in the standard lexicon. The quality of each match is rated by a professional who understands the knowledge domain. The system uses this information to perform high precision coding and to measure the quality of the match.
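The knowledge base described in the abstract can be pictured with a minimal Python sketch. It is illustrative only and is not taken from the specification: the class name `KnowledgeBase`, the proposition labels, the placeholder code `EX-001`, and the 1 to 5 quality scale are all assumptions. It shows sentences with unknown semantics being set aside so they can later be added, and an expert-assigned quality rating travelling with each matched code.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    # normalized sentence -> proposition label (labels are hypothetical)
    sentence_to_proposition: dict = field(default_factory=dict)
    # proposition -> list of (code, quality); quality is an assumed 1-5 expert rating
    proposition_to_codes: dict = field(default_factory=dict)

    @staticmethod
    def normalize(sentence):
        # Lowercase, collapse whitespace, and drop trailing periods before lookup.
        return " ".join(sentence.lower().split()).rstrip(".")

    def add_sentence(self, sentence, proposition):
        # A domain expert annotates a sentence with its proposition.
        self.sentence_to_proposition[self.normalize(sentence)] = proposition

    def rate_code(self, proposition, code, quality):
        # A domain expert links a proposition to a lexicon code and rates the match.
        self.proposition_to_codes.setdefault(proposition, []).append((code, quality))

    def code(self, sentences):
        # Return (coded, unknown): coded holds (sentence, code, quality) triples,
        # unknown holds sentences whose semantics are not yet in the knowledge base.
        coded, unknown = [], []
        for sentence in sentences:
            proposition = self.sentence_to_proposition.get(self.normalize(sentence))
            if proposition is None:
                unknown.append(sentence)
                continue
            for code, quality in self.proposition_to_codes.get(proposition, []):
                coded.append((sentence, code, quality))
        return coded, unknown


kb = KnowledgeBase()
kb.add_sentence("The heart is enlarged.", "heart | size | enlarged")
kb.rate_code("heart | size | enlarged", "EX-001", 5)  # EX-001 is a placeholder code

coded, unknown = kb.code(["The heart is enlarged.", "The lungs are clear."])
# One simple way to summarize match quality for a document (an assumption).
mean_quality = sum(q for _, _, q in coded) / len(coded) if coded else 0.0
```

In this sketch the second sentence ends up in `unknown`, which is where a coder would step in and annotate it before it is added with `add_sentence`.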
25 Claims
1. A computer implemented method for assigning codes from a standard lexicon to a free text document describing physical or tangible objects, the method comprising the steps of:
(a) automatically segmenting said free text document into a plurality of sentences;
(b) using a computer processor to retrieve a plurality of propositions by matching said sentences in a semantic mapping table created by domain experts through semantically annotating sentences from a corpus of related documents in a knowledge domain to propositions,
(c) using a computer processor to retrieve a plurality of codes in a standard lexicon by matching said propositions to said codes created by a third party, in a code mapping table created by domain experts by annotating said propositions to said codes;
wherein one or more of said matching codes from said standard lexicon represents at least a portion of the semantic content of said free text document.
Dependent claims: 2-18.
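Steps (a) through (c) of claim 1 can be read as a segmentation pass followed by two table lookups. The Python sketch below is one possible reading, not the patented implementation: the table contents, the proposition labels, the placeholder codes `EX-001` and `EX-002`, the punctuation-based sentence splitter, and the exact-match normalization are all assumptions made for illustration.

```python
import re

# Hypothetical semantic mapping table: normalized sentence -> propositions.
SEMANTIC_MAPPING_TABLE = {
    "the heart is enlarged": ["heart | size | enlarged"],
    "no pleural effusion is seen": ["pleural effusion | presence | absent"],
}

# Hypothetical code mapping table: proposition -> codes in a standard lexicon.
CODE_MAPPING_TABLE = {
    "heart | size | enlarged": ["EX-001"],
    "pleural effusion | presence | absent": ["EX-002"],
}


def segment(document):
    # Step (a): naive split after '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]


def normalize(sentence):
    # Lowercase, collapse whitespace, strip terminal punctuation.
    return " ".join(sentence.lower().split()).rstrip(".!?")


def retrieve_propositions(sentences):
    # Step (b): match each sentence in the semantic mapping table.
    propositions = []
    for sentence in sentences:
        propositions.extend(SEMANTIC_MAPPING_TABLE.get(normalize(sentence), []))
    return propositions


def retrieve_codes(propositions):
    # Step (c): match each proposition in the code mapping table, without duplicates.
    codes = []
    for proposition in propositions:
        for code in CODE_MAPPING_TABLE.get(proposition, []):
            if code not in codes:
                codes.append(code)
    return codes


def assign_codes(document):
    return retrieve_codes(retrieve_propositions(segment(document)))


codes = assign_codes(
    "The heart is enlarged. No pleural effusion is seen. Follow up in one week."
)
```

The third sentence has no entry in the semantic mapping table, so it contributes no codes; the returned codes therefore represent only a portion of the document's semantic content, which is all the wherein clause requires.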
19. A system for assigning codes from a standard lexicon to a free text document, comprising:
(a) an input queue of documents; and
(b) a segmentation module for processing free text into headers and sentences; and
(c) a proposition look up engine that associates said segmented sentences with a plurality of propositions using a semantic mapping table created by domain experts through a process of semantically annotating sentences from a corpus of related documents in a knowledge domain to propositions; and
(d) a code look up engine that associates said propositions with a plurality of codes from a standard lexicon using a code mapping table created by domain experts through a process of annotating said propositions to said codes.
Dependent claims: 20-24.
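The four components of claim 19 could be wired together roughly as in the Python sketch below. It is a hypothetical arrangement, not the system disclosed in the specification: the class name `CodingSystem`, the rule that a header is a line ending in a colon, the simple period-based sentence split, and the placeholder code `EX-001` are assumptions.

```python
from collections import deque


def segment(text):
    # Segmentation module: split free text into ("header", ...) and ("sentence", ...) items.
    # Assumption: a header is any non-empty line that ends with a colon.
    items = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            items.append(("header", line[:-1]))
        else:
            for part in line.split(". "):
                part = part.strip().rstrip(".")
                if part:
                    items.append(("sentence", part))
    return items


class CodingSystem:
    def __init__(self, semantic_table, code_table):
        self.input_queue = deque()            # (a) input queue of documents
        self.semantic_table = semantic_table  # used by the proposition look up engine
        self.code_table = code_table          # used by the code look up engine

    def submit(self, document):
        self.input_queue.append(document)

    def process_next(self):
        # Take the oldest document and run it through components (b), (c) and (d).
        document = self.input_queue.popleft()
        results, header = [], None
        for kind, text in segment(document):
            if kind == "header":
                header = text
                continue
            propositions = self.semantic_table.get(text.lower(), [])
            codes = [c for p in propositions for c in self.code_table.get(p, [])]
            results.append(
                {"header": header, "sentence": text,
                 "propositions": propositions, "codes": codes}
            )
        return results


system = CodingSystem(
    semantic_table={"the heart is enlarged": ["heart | size | enlarged"]},
    code_table={"heart | size | enlarged": ["EX-001"]},  # placeholder code
)
system.submit("FINDINGS:\nThe heart is enlarged. The lungs are clear.\n")
report = system.process_next()
```

Keeping the header alongside each sentence is a design choice of this sketch; it lets a later step interpret the same sentence differently depending on the section of the document it appears in.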
25. Computer readable media having computer readable instructions for assigning codes from a standard lexicon to a free text document, the instructions comprising:
(a) instructions for segmenting said free text document into a plurality of sentences;
(b) instructions for retrieving a plurality of propositions by matching said sentences in a semantic mapping table created by domain experts through semantically annotating sentences from a corpus of related documents in a knowledge domain to propositions,
(c) instructions for retrieving a plurality of codes in a standard lexicon by matching said propositions to said codes created by a third party, in a code mapping table created by domain experts by annotating said propositions to said codes;
wherein one or more of said matching codes from said standard lexicon represents at least a portion of the semantic content of said free text document.
Specification