Process and system for high precision coding of free text documents against a standard lexicon

US 7,610,192 B1
Filed: 03/22/2006
Issued: 10/27/2009
Est. Priority Date: 03/22/2006
Status: Expired due to Fees

Abstract

Coding free text documents, especially in medicine, has become an urgent priority as electronic medical records (EMR) mature, and the need to exchange data between EMRs becomes more acute. However, only a few automated coding systems exist, and they can only code a small portion of the free text against a limited number of codes. The precision of these systems is low and code quality is not measured. The present invention discloses a process and system which implements semantic coding against standard lexicon(s) with high precision. The standard lexicon can come from a number of different sources, but is usually developed by a standards body. The system is semi-automated to enable medical coders or others to process free text documents at a rapid rate and with high precision. The system performs the steps of segmenting a document, flagging the need for corrections, validating the document against a data type definition, and looking up both the semantics and standard codes which correspond to the document's sentences. The coder has the option to intervene at any step in the process to fix mistakes made by the system. A knowledge base, consisting of propositions, represents the semantic knowledge in the domain. When sentences with unknown semantics are discovered, they can be easily added to the knowledge base. The propositions in the knowledge base are associated with codes in the standard lexicon. The quality of each match is rated by a professional who understands the knowledge domain. The system uses this information to perform high precision coding and measure the quality of the match.
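The two-stage lookup described in the abstract (sentence to proposition, then proposition to code) can be illustrated with a minimal sketch. This is not the patented implementation: the table contents, the proposition identifiers, the placeholder codes (`LEX-0001`, `LEX-0002`), the quality labels, and the naive `segment` and `normalize` helpers are all assumptions made for illustration.

```python
import re

# Hypothetical semantic mapping table: normalized sentence -> proposition IDs.
SEMANTIC_MAP = {
    "the heart is enlarged": ["P_CARDIOMEGALY"],
    "no pleural effusion is seen": ["P_NO_PLEURAL_EFFUSION"],
}

# Hypothetical code mapping table: proposition ID -> (code, expert-rated quality).
# The code strings are placeholders, not real lexicon entries.
CODE_MAP = {
    "P_CARDIOMEGALY": [("LEX-0001", "good")],
    "P_NO_PLEURAL_EFFUSION": [("LEX-0002", "fair")],
}


def segment(text):
    """Naive splitter: break after '.', '?' or '!' followed by whitespace."""
    parts = re.split(r"(?<=[.?!])\s+", text.strip())
    return [p for p in parts if p]


def normalize(sentence):
    """Lowercase and drop trailing punctuation so the sentence can act as a key."""
    return sentence.lower().rstrip(".?!").strip()


def code_document(text):
    """Return (coded matches, sentences with unknown semantics)."""
    coded, unknown = [], []
    for sentence in segment(text):
        propositions = SEMANTIC_MAP.get(normalize(sentence))
        if propositions is None:
            # Candidate for later annotation and addition to the knowledge base.
            unknown.append(sentence)
            continue
        for prop in propositions:
            for code, quality in CODE_MAP.get(prop, []):
                coded.append((sentence, prop, code, quality))
    return coded, unknown


coded, unknown = code_document(
    "The heart is enlarged. No pleural effusion is seen. Bones are intact."
)
```

Here `unknown` collects the sentences that have no entry in the semantic mapping table, which is the point at which a coder or knowledge engineer would step in.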

25 Claims

1. A computer implemented method for assigning codes from a standard lexicon to a free text document describing physical or tangible objects, the method comprising the steps of:
- (a) automatically segmenting said free text document into a plurality of sentences;
  
  (b) using a computer processor to retrieve a plurality of propositions by matching said sentences in a semantic mapping table created by domain experts through semantically annotating sentences from a corpus of related documents in a knowledge domain to propositions;
  
  (c) using a computer processor to retrieve a plurality of codes in a standard lexicon by matching said propositions to said codes created by a third party, in a code mapping table created by domain experts by annotating said propositions to said codes;
  
  wherein one or more of said matching codes from said standard lexicon represents at least a portion of the semantic content of said free text document.
- Dependent claims:
  - 2. The method according to claim 1 further comprising a step of correcting the segmented document prior to step c.
  - 3. The method according to claim 1 further comprising a step of resolution prior to step c.
  - 4. The method according to claim 1 further comprising a step of validation prior to step c.
  - 5. The method according to claim 1, wherein sentences without known propositions determined through step b are displayed and are optionally sent to a knowledge engineer for semantic annotation prior to step c.
  - 6. The method according to claim 1, wherein the matched propositions in step b are displayed, and optionally excluded from step c by the user.
  - 7. The method according to claim 1, wherein the matched codes from the standard lexicon are displayed.
  - 8. The method according to claim 1, wherein sentences without matching codes from the standard lexicon are identified.
  - 9. The method according to claim 1, wherein the semantic match quality of the matched codes from the standard lexicon as determined by a domain expert are displayed.
  - 10. The method according to claim 1, wherein sentences with a match quality other than ‘good’ as determined by a domain expert are displayed.
  - 11. The method according to claim 1, wherein the matched codes from the standard lexicon are stored in a database.
  - 12. The method according to claim 1, wherein the matched propositions are stored in a database.
  - 13. The method according to claim 1, wherein the matched codes from the standard lexicon are added to the document's metadata.
  - 14. The method according to claim 1, wherein the matched codes from the standard lexicon are optionally excluded by the user.
  - 15. The method of claim 1, wherein the document is a physician note or report.
  - 16. The method of claim 1, wherein the standard lexicon is selected from the group consisting of:
    - (a) SNOMED CT, (b) ICD-9-CM, (c) ICD-10, (d) HCPCS, (e) NDC, (f) CPT, (g) UMLS, (h) LOINC, (i) CDPN, (j) DRG.
  - 17. The method of claim 1, wherein the standard lexicon is either pre-coordinated or compositional.
  - 18. The method of claim 1, wherein the segmentation is completely automatic and optionally modified by the user.
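Several of the dependent claims above concern what happens after matching: showing matches rated other than ‘good’, letting the user exclude codes, and adding codes to the document's metadata. The sketch below illustrates one way those steps could fit together. The `Match` and `Document` types, the `needs_review` and `attach_codes` helpers, the `"codes"` metadata key, and the placeholder codes are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Match:
    sentence: str
    proposition: str
    code: str
    quality: str  # expert-assigned rating, e.g. "good"


@dataclass
class Document:
    text: str
    metadata: dict = field(default_factory=dict)


def needs_review(matches):
    """Matches whose expert rating is anything other than 'good' (cf. claim 10)."""
    return [m for m in matches if m.quality != "good"]


def attach_codes(document, matches, excluded_codes=frozenset()):
    """Store codes in the metadata, skipping user-excluded ones (cf. claims 13, 14)."""
    accepted = [m.code for m in matches if m.code not in excluded_codes]
    document.metadata["codes"] = accepted
    return document


matches = [
    Match("The heart is enlarged.", "P_CARDIOMEGALY", "LEX-0001", "good"),
    Match("No pleural effusion is seen.", "P_NO_PLEURAL_EFFUSION", "LEX-0002", "fair"),
]
doc = attach_codes(Document("..."), matches, excluded_codes={"LEX-0002"})
review = needs_review(matches)
```

In this sketch the review list and the exclusion set are independent: a coder could inspect the non-‘good’ matches first and then decide which codes to exclude.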

19. A system for assigning codes from a standard lexicon to a free text document, comprising:
- (a) an input queue of documents; and
  
  (b) a segmentation module for processing free text into headers and sentences; and
  
  (c) a proposition look up engine that associates said segmented sentences with a plurality of propositions using a semantic mapping table created by domain experts through a process of semantically annotating sentences from a corpus of related documents in a knowledge domain to propositions; and
  
  (d) a code look up engine that associates said propositions with a plurality of codes from a standard lexicon using a code mapping table created by domain experts through a process of annotating said propositions to said codes.
- Dependent claims:
  - 20. The system of claim 19, further comprising a module for correcting free text.
  - 21. The system of claim 19, where errors are displayed with a property that provides a visual indication that distinguishes it from normal text such as a different color, font, size, highlighting, underlining, label, or any combination.
  - 22. The system of claim 19, further comprising a module for validating free text against a document type definition.
  - 23. The system of claim 19, where codes from the standard lexicon are added to the document's metadata.
  - 24. The system of claim 19, where matched propositions are stored in a database.
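The first two elements of the system claim, an input queue of documents and a segmentation module that separates headers from sentences, might be sketched as follows. The header convention (a capitalized line ending in a colon), the `HEADER` pattern, and the sample report text are assumptions chosen for illustration; the patent does not specify them here.

```python
import re
from collections import deque

# Assumed header convention: a line such as "FINDINGS:" on its own.
HEADER = re.compile(r"^[A-Z][A-Za-z ]*:$")


def segment(text):
    """Return (header, sentence) pairs; sentences before any header get None."""
    header, out = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if HEADER.match(line):
            header = line[:-1]  # drop the trailing colon
            continue
        for sentence in re.split(r"(?<=[.?!])\s+", line):
            if sentence:
                out.append((header, sentence))
    return out


# Input queue of documents, processed first-in, first-out.
queue = deque()
queue.append(
    "FINDINGS:\nThe heart is enlarged. The lungs are clear.\nIMPRESSION:\nCardiomegaly."
)

results = []
while queue:
    results.append(segment(queue.popleft()))
```

Each `(header, sentence)` pair would then be handed to the proposition look up engine; keeping the header alongside the sentence is a design choice of this sketch, not a requirement stated in the claim.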

25. Computer readable media having computer readable instructions for assigning codes from a standard lexicon to a free text document, the instructions comprising:
- (a) instructions for segmenting said free text document into a plurality of sentences;
  
  (b) instructions for retrieving a plurality of propositions by matching said sentences in a semantic mapping table created by domain experts through semantically annotating sentences from a corpus of related documents in a knowledge domain to propositions;
  
  (c) instructions for retrieving a plurality of codes in a standard lexicon by matching said propositions to said codes created by a third party, in a code mapping table created by domain experts by annotating said propositions to said codes;
  
  wherein one or more of said matching codes from said standard lexicon represents at least a portion of the semantic content of said free text document.

Current Assignee: Patrick William Jamieson
Original Assignee: Patrick William Jamieson
Inventors: Jamieson, Patrick William
Primary Examiner(s): Sked; Matthew J
Application Number: US11/386,996
Time in Patent Office: 1,315 Days
Field of Search: None
US Class Current: 704/9
CPC Class Codes:
G06F 40/131   Fragmentation of text files...
G06Q 40/08   Insurance
G16H 10/60   for patient-specific data, ...
G16H 15/00   ICT specially adapted for m...
Y10S 707/99945   Object-oriented database st...
Y10S 707/99948   Application of database or ...
