System, method, and computer program product for obtaining structured data from text
Abstract
A method, system, and computer program product for obtaining structured data from text includes inputting text to be transformed into structured data, defining at least one criterion relating to the content of the text, applying a first matching method to the text to identify a value in the text for each criterion, and applying a second matching method to the text to identify a value in the text for each criterion for which the first matching method fails to identify a value. Each criterion is then associated with a value obtained by the first or second matching method to create at least one criterion-value pair, and structured data is created based on the at least one criterion-value pair. The first matching method has a higher reliability than the second matching method.
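The two-stage matching described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `CRITERIA` vocabulary, the choice of whole-word matching as the higher-reliability first method, and the choice of approximate string matching (Python's `difflib`) as the lower-reliability second method.

```python
import re
from difflib import get_close_matches

# Hypothetical criteria, each with a vocabulary of admissible values.
CRITERIA = {
    "component": ["printer", "scanner", "monitor"],
    "symptom": ["jammed", "flickering", "offline"],
}

def exact_match(text, values):
    """First method (assumed more reliable): whole-word, case-insensitive match."""
    for value in values:
        if re.search(rf"\b{re.escape(value)}\b", text, flags=re.IGNORECASE):
            return value
    return None

def fuzzy_match(text, values, cutoff=0.8):
    """Second method (assumed less reliable): approximate match on each token."""
    for token in re.findall(r"[a-z]+", text.lower()):
        close = get_close_matches(token, values, n=1, cutoff=cutoff)
        if close:
            return close[0]
    return None

def transform(text, criteria=CRITERIA):
    """Build criterion-value pairs, using the second method only as a fallback."""
    record = {}
    for criterion, values in criteria.items():
        value, method = exact_match(text, values), "exact"
        if value is None:
            # The first method found nothing for this criterion; try the second.
            value, method = fuzzy_match(text, values), "fuzzy"
        if value is not None:
            record[criterion] = {"value": value, "method": method}
    return record

result = transform("The printr is flickering again")
```

In this sketch the misspelled "printr" is recovered only by the fallback method, while "flickering" is found by the first method, so the record keeps track of which method produced each value.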
71 Citations
17 Claims
1. A computer-readable software system for effecting on a computer an automatic case acquisition system using a text transformation system, the system comprising:
a case acquisition system;
a case based reasoning engine; and
a model editor interface;
wherein the case acquisition system comprises a memory device having embodied therein, data related to transforming text into structured data, and a processor in communication with said memory device, said processor configured to cause the system to input text to be transformed into structured data, define at least one criterion relating to content of said text, apply a first matching method to said text to identify a value in said text for each of said criterion, apply a second matching method to said text to identify a value in said text for each of said criterion that said first matching method fails to identify a value for, associate each of said criterion with a value obtained by said first or second matching method to create at least one criterion-value pair, and create structured data based on said at least one criterion-value pair, where the first matching method has a higher reliability than said second matching method;
wherein the case based reasoning engine comprises a structured advisory system; and
wherein the model editor interface comprises an interface that allows a system administrator to add at least a relation and at least a similarity information usable to the case based reasoning engine.
Dependent claims: 2 to 15.
16. A system for generating cases to be used by a case based reasoning engine, the system comprising a memory device having embodied therein, data related to generating cases used by a case based reasoning engine, and a processor in communication with the memory device, the processor configured to cause the system to comprise:
a case acquisition system;
a case based reasoning engine;
a model editor interface;
at least a bulk file of unstructured records;
a structured domain model and lexical script;
a validation set of cases created by the case acquisition system to be validated by a user to be added to the structured cases; and
structured cases created by the case acquisition system;
wherein the case acquisition system is a text transformation system able to transform bulk files of unstructured records into data to be processed ultimately by the case based reasoning engine, the model editor interface comprises an interface that allows a system administrator to add at least a relation and at least a similarity information usable to the case based reasoning engine, the model editor interface taking input from the structured cases or the structured domain model and lexical script.
17. A computer-readable software system embodied on a computer-readable medium for generating cases to be used by a case based reasoning engine associated with a degree of confidence, the system comprising:
at least a bulk file of unstructured records;
a case acquisition system able to transform bulk files of unstructured records into either a validation set of cases or a structured set of cases;
a case based reasoning engine able to process the structured set of cases;
wherein the validation set of cases is validated by a user and then added to the structured set of cases, and where the case acquisition system generates cases associated with a degree of confidence that is a function of a degree of confidence associated with the recognized text values at a threshold, where cases with a degree of confidence lower than the threshold are sent to the validation set and the cases associated with a degree of confidence higher than the threshold are sent to the structured set of cases.
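The confidence-based routing recited in claim 17 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the `Case` class, the rule that a case's confidence is the minimum of its value confidences, the default threshold of 0.7, and the handling of a case exactly at the threshold (sent to validation here) are all hypothetical choices that the claim does not specify.

```python
from dataclasses import dataclass

@dataclass
class Case:
    # Maps each criterion to a (value, confidence) pair, confidence in [0, 1].
    values: dict

    def confidence(self):
        # Assumption: a case is only as reliable as its weakest recognized value.
        return min((conf for _, conf in self.values.values()), default=0.0)

def route(cases, threshold=0.7):
    """Split cases into a structured set and a validation set by confidence."""
    structured, validation = [], []
    for case in cases:
        if case.confidence() > threshold:
            structured.append(case)
        else:
            # Lower confidence (and, by assumption, equal to the threshold):
            # hold the case for user validation.
            validation.append(case)
    return structured, validation

def accept(case, structured, validation):
    """Move a user-validated case from the validation set to the structured set."""
    validation.remove(case)
    structured.append(case)

confident = Case({"component": ("printer", 0.95), "symptom": ("jammed", 0.9)})
doubtful = Case({"component": ("printer", 0.6), "symptom": ("jammed", 0.9)})
structured, validation = route([confident, doubtful])
```

After routing, `confident` sits in the structured set and `doubtful` waits in the validation set until a user accepts it via `accept`.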
Specification