Specification-based automation methods for medical content extraction, data aggregation and enrichment
First Claim
1. A method for structuring diverse medical data as a database comprising:
- extracting content from the diverse medical data comprising;
generating Extensible Markup Language (XML)-based extraction rules for each type of medical data based on predetermined information from the medical data document type;
converting each medical data document type into one or more standard document types, wherein each standard document type includes text and image object relevant content;
extracting the relevant content from each standard document type using the XML-based extraction rules associated with the medical data document type as an XML syntax;
indexing the extracted relevant content; and
storing the indexed relevant content;
aggregating the stored indexed relevant content comprising;
creating data migration language extraction rules;
extracting the stored indexed relevant content using the data extraction rules;
creating data migration language constraint differential rules;
normalizing the extracted indexed relevant content using the constraint differential rules;
storing the normalized relevant content;
creating data migration language record/rectify merge rules;
consolidating the normalized relevant content using the record/rectify merge rules; and
storing the consolidated relevant content;
enriching the stored consolidated relevant content comprising;
creating one or more XML templates; and
sequentially processing the consolidated relevant content using the one or more XML templates into models.
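The extraction step of the claim can be illustrated with a minimal sketch. The claim only states that the extraction rules are XML-based; the rule format below (`extractionRules`, `field`, `pattern`), the function names, and the sample document are all hypothetical and chosen for illustration, and the document is assumed to have already been converted to plain text.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical XML rule set for one document type. The element and
# attribute names are assumptions; the claim does not define a schema.
RULES_XML = r"""
<extractionRules documentType="discharge_summary">
  <field name="patient_id" pattern="Patient ID:\s*(\w+)"/>
  <field name="diagnosis" pattern="Diagnosis:\s*(.+)"/>
</extractionRules>
"""


def load_rules(rules_xml):
    """Parse the XML rules into (document type, {field name: compiled regex})."""
    root = ET.fromstring(rules_xml.strip())
    rules = {
        field.get("name"): re.compile(field.get("pattern"))
        for field in root.iter("field")
    }
    return root.get("documentType"), rules


def extract(text, rules):
    """Apply each rule to the text and keep the first captured group."""
    record = {}
    for name, pattern in rules.items():
        match = pattern.search(text)
        if match:
            record[name] = match.group(1).strip()
    return record


def build_index(records):
    """Index extracted values: (field, lowercased value) -> record positions."""
    index = {}
    for position, record in enumerate(records):
        for name, value in record.items():
            index.setdefault((name, value.lower()), []).append(position)
    return index


doc_type, rules = load_rules(RULES_XML)
sample_text = "Patient ID: A123\nDiagnosis: Type 2 diabetes\n"
record = extract(sample_text, rules)
index = build_index([record])
```

Keeping the rules in XML rather than in code means a new document type could, under this assumption, be supported by adding a rule file instead of changing the extractor.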
Abstract
A method for knowledge generation from raw medical records uses XML-based specifications. The method includes content extraction, data aggregation and data enrichment. The method operates on various sources of medical data including financial data, clinical documents and medical images.
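The aggregation and enrichment stages named in the abstract can be sketched in the same spirit. The source does not define the data migration language or the template schema, so the normalization table, the first-value-wins merge policy, and the `slot` template format below are hypothetical stand-ins, not the patented method.

```python
import xml.etree.ElementTree as ET

# Hypothetical normalization rules: field name -> cleaning function.
NORMALIZERS = {
    "patient_id": lambda v: v.strip().upper(),
    "sex": lambda v: {"m": "male", "f": "female"}.get(v.strip().lower(), "unknown"),
}

# Hypothetical XML template listing the fields that make up a model.
TEMPLATE_XML = (
    '<patientModel>'
    '<slot field="patient_id"/><slot field="sex"/><slot field="diagnosis"/>'
    '</patientModel>'
)


def normalize(record):
    """Normalize each field; fields without a rule are only stripped."""
    return {k: NORMALIZERS.get(k, str.strip)(v) for k, v in record.items()}


def consolidate(records, key="patient_id"):
    """Merge records sharing a key; the first value seen for a field wins."""
    merged = {}
    for record in records:
        target = merged.setdefault(record[key], {})
        for field, value in record.items():
            target.setdefault(field, value)
    return merged


def apply_template(record, template_xml):
    """Fill the template's slots from a consolidated record, as an XML string."""
    template = ET.fromstring(template_xml)
    model = ET.Element(template.tag)
    for slot in template.iter("slot"):
        field = slot.get("field")
        if field in record:
            ET.SubElement(model, field).text = record[field]
    return ET.tostring(model, encoding="unicode")


raw = [
    {"patient_id": " a123 ", "sex": "F"},
    {"patient_id": "A123", "diagnosis": "Type 2 diabetes "},
]
consolidated = consolidate([normalize(r) for r in raw])
model_xml = apply_template(consolidated["A123"], TEMPLATE_XML)
```

Here two records for the same patient, one from each hypothetical source, are normalized, merged into a single record, and then rendered through the template into one model.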
20 Claims
11. A computer program product comprising a computer readable recording medium having recorded thereon a computer program comprising code means for, when executed on a computer, instructing the computer to control steps in a method for structuring diverse medical data as a database comprising:
- extracting content from the diverse medical data comprising;
generating Extensible Markup Language (XML)-based extraction rules for each type of medical data based on predetermined information from the medical data document type;
converting each medical data document type into one or more standard document types, wherein each standard document type includes text and image object relevant content;
extracting the relevant content from each standard document type using the XML-based extraction rules associated with the medical data document type as an XML syntax;
indexing the extracted relevant content; and
storing the indexed relevant content;
aggregating the stored indexed relevant content comprising;
creating data migration language extraction rules;
extracting the stored indexed relevant content using the data extraction rules;
creating data migration language constraint differential rules;
normalizing the extracted indexed relevant content using the constraint differential rules;
storing the normalized relevant content;
creating data migration language record/rectify merge rules;
consolidating the normalized relevant content using the record/rectify merge rules; and
storing the consolidated relevant content;
enriching the stored consolidated relevant content comprising;
creating one or more XML templates; and
sequentially processing the consolidated relevant content using the one or more XML templates into models.
Specification