Creation of structured data from plain text
Abstract
A method and system for converting plain text into structured data. Parse trees for the plain text are generated based on the grammar of a natural language, and the parse trees are mapped onto instance trees generated based on an application-specific model. The best map is chosen, and the corresponding instance tree is passed to an application for execution. The method and system can be used both for populating a database and for retrieving data from a database based on a query.
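The flow described in the abstract (tokenize, build candidate parse trees, map them to instance trees against an application model, discard invalid or incomplete ones, pick the lowest-cost map) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the "car" model with its REQUIRED and OPTIONAL slots, the LEXICON that stands in for a real natural-language grammar, and the particular cost function.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional

# Hypothetical application model: a "car" record with required and optional slots.
REQUIRED = {"make", "color"}
OPTIONAL = {"year"}

# Hypothetical lexicon standing in for a natural-language grammar.
# "2001" is ambiguous: it could be read as a year or as a price.
LEXICON = {"red": ["color"], "ford": ["make"], "2001": ["year", "price"]}


@dataclass
class InstanceTree:
    slots: dict    # model slot -> value taken from the text
    unmapped: int  # tokens the mapping left unused


def tokenize(text: str) -> list:
    return text.lower().replace(",", " ").split()


def parse_trees(tokens: list) -> list:
    # Each toy "parse" assigns every token one category, or None (ignored).
    choices = [LEXICON.get(tok, []) + [None] for tok in tokens]
    return [list(zip(tokens, combo)) for combo in product(*choices)]


def to_instance_tree(parse: list) -> Optional[InstanceTree]:
    slots, unmapped = {}, 0
    for token, category in parse:
        if category is None:
            unmapped += 1
        elif category not in REQUIRED | OPTIONAL or category in slots:
            return None  # invalid: slot unknown to the model, or filled twice
        else:
            slots[category] = token
    if not REQUIRED.issubset(slots):
        return None  # incomplete: a required slot is missing
    return InstanceTree(slots, unmapped)


def cost(tree: InstanceTree) -> int:
    # Assumed cost: penalize ignored tokens and unfilled optional slots.
    return tree.unmapped + len(OPTIONAL.difference(tree.slots))


def convert(text: str) -> Optional[InstanceTree]:
    trees = [to_instance_tree(p) for p in parse_trees(tokenize(text))]
    valid = [t for t in trees if t is not None]
    return min(valid, key=cost) if valid else None


if __name__ == "__main__":
    print(convert("red Ford 2001"))
```

In this sketch the reading of "2001" as a price is discarded because the model has no such slot, and the reading that ignores "2001" survives but costs more than the one that fills the year slot. A real system would derive its parse trees from a grammar rather than from per-token lookups.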
44 Citations
42 Claims
1-22. (canceled)
23. A system, comprising:
a processor configured to:
    tokenize a plain text description;
    create parse trees from the tokenized plain text description based on grammar from a grammar storage area;
    generate an instance tree from each parse tree based upon an application domain specific natural markup language provided by a natural markup language model module;
    discard each invalid or incomplete instance tree;
    choose an instance tree from remaining instance trees representing a best map based upon a cost function; and
    process the best map with a domain markup language generator to generate a structured data representation; and
a memory coupled to the processor and configured to provide the processor with instructions.
Dependent claims: 24, 25, 26, 27, 28, 29, 30, 31, 32, 33
34. A computer program product embodied in a computer readable medium and comprising computer instructions for:
    tokenizing a plain text description;
    creating parse trees from the tokenized plain text description based on grammar from a grammar storage area;
    generating an instance tree from each parse tree based upon an application domain specific natural markup language provided by a natural markup language model module;
    discarding each invalid or incomplete instance tree;
    choosing an instance tree from remaining instance trees representing a best map based upon a cost function; and
    processing the best map with a domain markup language generator to generate a structured data representation.
Dependent claims: 35, 36, 37, 38, 39, 40, 41, 42
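The final step of the independent claims, processing the best map with a domain markup language generator, might look roughly like the following Python sketch. It assumes, purely for illustration, that the chosen instance tree has been reduced to a flat mapping of slot names to values and that the target markup is plain XML. The to_markup helper and the "car" record type are hypothetical and do not come from the patent.

```python
import xml.etree.ElementTree as ET


def to_markup(record_type: str, slots: dict) -> str:
    """Serialize a flat slot mapping as a small XML fragment (assumed format)."""
    root = ET.Element(record_type)
    # Sort slot names so the output is deterministic.
    for name in sorted(slots):
        child = ET.SubElement(root, name)
        child.text = str(slots[name])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_markup("car", {"make": "ford", "color": "red", "year": "2001"}))
```

The same structured fragment could then be used to insert a record into a database or, for a query, translated into a lookup, which matches the two uses named in the abstract.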
Specification