Creation of structured data from plain text
Abstract
A method and system for converting plain text into structured data. Parse trees for the plain text are generated based on the grammar of a natural language, and the parse trees are mapped onto instance trees generated based on an application-specific model. The best map is chosen, and the instance tree is passed to an application for execution. The method and system can be used for populating a database, for retrieving data from a database based on a query, or both.
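The two uses named in the abstract, populating a database and querying one, can be illustrated with a minimal sketch. It assumes the chosen instance tree has already been flattened into a field-to-value dictionary; the `listings` table, its columns, and the `populate` and `retrieve` helpers are hypothetical names chosen for illustration and do not come from the patent.

```python
import sqlite3

# Hypothetical column whitelist. Column names cannot be bound as SQL
# parameters, so they are checked against this set before interpolation.
COLUMNS = ("make", "model", "year")


def _checked_fields(instance):
    """Return the instance's field names, rejecting any unknown column."""
    unknown = set(instance) - set(COLUMNS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return sorted(instance)


def populate(conn, instance):
    """Use the structured instance as a new row (the 'populating' case)."""
    fields = _checked_fields(instance)
    placeholders = ", ".join("?" for _ in fields)
    conn.execute(
        f"INSERT INTO listings ({', '.join(fields)}) VALUES ({placeholders})",
        [instance[f] for f in fields],
    )


def retrieve(conn, instance):
    """Use the structured instance as query constraints (the 'retrieving' case)."""
    fields = _checked_fields(instance)
    where = " AND ".join(f"{f} = ?" for f in fields) or "1 = 1"
    sql = f"SELECT make, model, year FROM listings WHERE {where} ORDER BY year"
    return conn.execute(sql, [instance[f] for f in fields]).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listings (make TEXT, model TEXT, year TEXT)")
populate(conn, {"make": "ford", "model": "mustang", "year": "2004"})
populate(conn, {"make": "ford", "model": "focus", "year": "2001"})
rows = retrieve(conn, {"make": "ford"})
```

The same dictionary shape drives both paths: a full description becomes a row, while a partial description becomes a filter.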
14 Claims
1. A computerized method comprising:
- tokenizing a plain text description;
- creating parse trees from the tokenized plain text description based on grammar from a grammar storage area;
- generating an instance tree from each parse tree based upon an application domain specific natural markup language provided by a natural markup language model module;
- discarding each invalid or incomplete instance tree;
- choosing an instance tree from remaining instance trees representing a best map based upon a cost function;
- processing the best map with a domain markup language generator to generate a structured data representation.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A computer-implemented system, comprising:
- a parser to create all parse trees from a tokenized plain text description based on grammar from a grammar storage area;
- a mapper to (A) generate an instance tree from each parse tree based upon an application domain specific natural markup language provided by a natural markup language model module, (B) prune the instance trees, and (C) choose an instance tree from remaining instance trees representing a best map based on a cost function;
- a domain markup language generator to process the best map to generate a domain markup language document.
Dependent claims: 13, 14
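The steps recited in claim 1 can be sketched end to end on a toy example. This is only an illustration under stated assumptions, not the patented implementation: the lexicon stands in for the grammar storage area, `MODEL_FIELDS` and `REQUIRED_FIELDS` stand in for the natural markup language model, the cost values are invented, and every function name is hypothetical. Real parse trees would be hierarchical; here each "parse tree" is a flat list of (token, category) pairs so the ambiguity handling stays visible.

```python
import itertools
import re
import xml.etree.ElementTree as ET

# Hypothetical lexicon: each word maps to every category it might fill.
LEXICON = {
    "red": ["color"],
    "ford": ["make"],
    "mustang": ["model", "animal"],
    "2004": ["year", "price"],
}
# Hypothetical domain model: allowed fields and the ones that must appear.
MODEL_FIELDS = {"color", "make", "model", "year", "price"}
REQUIRED_FIELDS = {"make", "model"}
# Hypothetical per-field costs; lower total cost means a better map.
FIELD_COST = {"year": 0, "price": 1}


def tokenize(text):
    """Split the plain text description into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def parse_trees(tokens):
    """Return every category assignment the lexicon allows, one per reading."""
    options = [[(tok, cat) for cat in LEXICON.get(tok, ["unknown"])] for tok in tokens]
    return [list(combo) for combo in itertools.product(*options)]


def to_instance_tree(parse_tree):
    """Map a parse tree onto the domain model; None marks invalid or incomplete."""
    instance = {}
    for token, category in parse_tree:
        if category == "unknown":
            continue  # words outside the lexicon are ignored in this sketch
        if category not in MODEL_FIELDS or category in instance:
            return None  # invalid: unknown field or field filled twice
        instance[category] = token
    if not REQUIRED_FIELDS.issubset(instance):
        return None  # incomplete: a required field is missing
    return instance


def cost(instance):
    """Sum the per-field costs of an instance tree."""
    return sum(FIELD_COST.get(field, 0) for field in instance)


def generate_markup(instance, root_tag="vehicle"):
    """Serialize the best map as a small XML document."""
    root = ET.Element(root_tag)
    for field in sorted(instance):
        ET.SubElement(root, field).text = instance[field]
    return ET.tostring(root, encoding="unicode")


def text_to_structured(text):
    """Run tokenize, parse, map, discard, choose, and generate in order."""
    candidates = [to_instance_tree(t) for t in parse_trees(tokenize(text))]
    valid = [c for c in candidates if c is not None]
    if not valid:
        raise ValueError("no valid instance tree")
    return generate_markup(min(valid, key=cost))
```

For the input `"Red Ford Mustang 2004"` the lexicon yields four readings. The two that treat "mustang" as an animal are discarded as invalid, and the cost function prefers reading "2004" as a year over a price.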
Specification