Mere-parsing with boundary and semantic driven scoping
First Claim
1. A method for building a tree of parse items, the method comprising the steps of:
receiving a plurality of parse items stored in an ordered data structure;
processing semantic attributes associated with the plurality of parse items;
generating a merged parse item from at least two parse items of the plurality of parse items, wherein generating comprises:
selecting a first parse item of the plurality of parse items;
selecting a second parse item of the plurality of parse items;
comparing the first parse item against a predetermined set of target semantic data definitions to obtain a first semantic match;
comparing the second parse item against the predetermined set to obtain a second semantic match;
comparing a combined text of the first and second parse items against the predetermined set to obtain a merged semantic match;
determining whether the merged semantic match is better than either the first or second semantic matches by testing semantic goodness; and
merging the first and second parse items responsive to determining that the merged semantic match is better; and
forming a portion of a tree data structure such that the merged parse item is a parent of the at least two parse items.
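The merge test recited in claim 1 can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent: the `TARGET_DEFINITIONS` vocabulary, the `ParseItem` and `SemanticMatch` types, the scoring rule (an exact phrase match scores its word count), and the single left-to-right pass over adjacent pairs. The sketch also reads "better than either" as "better than both", which is the stricter of the two possible readings.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional

# Hypothetical "predetermined set of target semantic data definitions":
# each label maps to the phrases it recognizes.
TARGET_DEFINITIONS = {
    "color": {"red", "blue", "green"},
    "material": {"steel", "stainless steel", "brass"},
    "fastener": {"bolt", "hex bolt", "screw"},
}


@dataclass
class ParseItem:
    text: str
    children: list[ParseItem] = field(default_factory=list)


@dataclass
class SemanticMatch:
    label: Optional[str]
    score: float  # assumed "semantic goodness": higher is better


def semantic_match(text: str) -> SemanticMatch:
    """Compare text against the definitions and return the best match found."""
    phrase = text.lower()
    best = SemanticMatch(None, 0.0)
    for label, vocabulary in TARGET_DEFINITIONS.items():
        if phrase in vocabulary:
            score = float(len(phrase.split()))  # longer exact phrases score higher
            if score > best.score:
                best = SemanticMatch(label, score)
    return best


def try_merge(first: ParseItem, second: ParseItem) -> Optional[ParseItem]:
    """Return a merged parent item if the combined text matches better than both parts."""
    first_match = semantic_match(first.text)
    second_match = semantic_match(second.text)
    combined_text = f"{first.text} {second.text}"
    merged_match = semantic_match(combined_text)
    if merged_match.score > max(first_match.score, second_match.score):
        # The merged item becomes the parent of the two original items.
        return ParseItem(combined_text, children=[first, second])
    return None


def build_tree(items: list[ParseItem]) -> list[ParseItem]:
    """One left-to-right pass that merges adjacent pairs when the merge scores better."""
    result: list[ParseItem] = []
    i = 0
    while i < len(items):
        merged = try_merge(items[i], items[i + 1]) if i + 1 < len(items) else None
        if merged is not None:
            result.append(merged)
            i += 2
        else:
            result.append(items[i])
            i += 1
    return result


items = [ParseItem(t) for t in ["stainless", "steel", "hex", "bolt", "red"]]
tree = build_tree(items)
print([node.text for node in tree])
```

With this toy vocabulary, "stainless" and "steel" merge because the combined phrase matches a two-word definition while neither part scores as high alone; "red" has no following neighbor and stays a leaf.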
Abstract
A method for building a tree of parse items involves receiving a plurality of parse items stored in an ordered data structure, processing semantic attributes associated with the plurality of parse items, generating a merged parse item from at least two parse items of the plurality of parse items, and forming a portion of a tree data structure such that the merged parse item is a parent of the at least two parse items.
9 Claims
4. A computer-implemented mere-parser system comprising:
a semantic test application configured to test a semantic goodness of a merged semantic match;
a semantic data storage configured to store semantic data received from the semantic test application;
a source data analysis unit comprising a mere-parser application configured to normalize source data, wherein the mere-parser application comprises a parse item bounding system that comprises instructions that when executed cause a processor to:
normalize formatting of the source data;
perform morphological processing on the source data;
normalize special phrases of the source data;
identify boundaries and form parse items from the source data; and
identify propagating attributes of the source data;
a source data storage configured to store source data;
a first bi-directional communication link that couples the semantic test application with the semantic data storage;
a second bi-directional communication link that couples the semantic test application with the source data analysis unit; and
a third bi-directional communication link that couples the source data analysis unit with the source data storage.
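The five processor steps of the parse item bounding system in claim 4 can be pictured as a small pipeline. The Python sketch below is only one possible reading under stated assumptions: the `SPECIAL_PHRASES` table, the crude plural-stripping rule, the choice of commas and semicolons as boundaries, and the treatment of a `key: value` chunk as a propagating attribute that applies to every later parse item are all hypothetical and are not defined in the claim text.

```python
import re

# Hypothetical table of special phrases and their canonical forms.
SPECIAL_PHRASES = {"s/s": "stainless steel"}


def normalize_formatting(text: str) -> str:
    """Collapse runs of whitespace, trim the ends, and lowercase."""
    return re.sub(r"\s+", " ", text).strip().lower()


def stem(word: str) -> str:
    """Crude stand-in for morphological processing: strip a plural 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def morphological_processing(text: str) -> str:
    """Apply the stemmer to every alphabetic run, leaving punctuation alone."""
    return re.sub(r"[a-z]+", lambda m: stem(m.group()), text)


def normalize_special_phrases(text: str) -> str:
    """Replace each known special phrase with its canonical form."""
    for raw, canonical in SPECIAL_PHRASES.items():
        text = text.replace(raw, canonical)
    return text


def identify_boundaries(text: str) -> list[str]:
    """Assumed boundary rule: commas and semicolons separate parse items."""
    return [chunk for chunk in re.split(r"\s*[,;]\s*", text) if chunk]


def identify_propagating_attributes(chunks: list[str]) -> list[dict]:
    """Assumed rule: a 'key: value' chunk sets an attribute on all later items."""
    attributes: dict[str, str] = {}
    items = []
    for chunk in chunks:
        match = re.fullmatch(r"([a-z]+):\s*(.+)", chunk)
        if match:
            attributes[match.group(1)] = match.group(2)
        else:
            items.append({"text": chunk, "attributes": dict(attributes)})
    return items


def bound_parse_items(source: str) -> list[dict]:
    """Run the five steps in the order they are listed in the claim."""
    text = normalize_formatting(source)
    text = morphological_processing(text)
    text = normalize_special_phrases(text)
    chunks = identify_boundaries(text)
    return identify_propagating_attributes(chunks)


parse_items = bound_parse_items("  Material: S/S;  hex  Bolts, WASHERS ")
for item in parse_items:
    print(item)
```

The output of such a pipeline would be the ordered list of parse items that a merge step like the one in claim 1 could then consume.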
Specification