Method and apparatus for semantic pattern matching for text retrieval
First Claim
1. A method for locating information in a natural language text using a computer including a video monitor and a memory device for storing a data structure representing the natural language text,wherein said data structure has a tree structure having nodes and levels, each node representing information from associated locations in said text, each of said locations having a corresponding location identifier, the information in a node being a categorical subset of information represented by a parent node of that node, each node being one level below its parent node in the tree structure and containing node constituents, said parent node and the nodes at higher levels of the tree structure that represent more general categories of information than the information represented by a node being ancestors of that node,the method comprising the steps of:
- formatting a query regarding information in the text into a query data structure;
comparing the query data structure to a node to produce a comparison value representative of the degree of similarity between the query and the information represented by the node;
enhancing the comparison value by an amount representative of the degree of similarity between the query and the information represented by ancestors of the node to produce an enhanced comparison value; and
storing the location identifiers associated with the node if the enhanced comparison value exceeds a predetermined lower bound and the node contributes information responsive to the query.
1 Assignment
0 Petitions
Accused Products
Abstract
A method and apparatus utilizing automated semantic pattern recognition whereby a user of a manual enters a query regarding information in the text of the manual and the invention displays the locations in which information responsive to the query is located. The present invention includes a computer that stores a data structure representing the natural language text of a manual, the structure being a tree structure having nodes, wherein each node represents information from associated locations of the text. The nodes at higher levels of the tree represent general categories of information. Nodes at succeedingly lower levels of the tree represent succeedingly more specific categorical subsets of the general categories of information represented at the higher levels by ancestors of the lower level nodes. The query is formatted into a query data structure that is similar in format to the node structure representing the text. The query data structure is compared to each node in the tree. If the degree of similarity between the query and the node exceeds a predetermined threshold value after taking into account the degree of similarity between the ancestors of the node and the query, then the locations associated with the node are displayed on a video monitor as locations containing information responsive to the query.
-
Citations
26 Claims
-
1. A method for locating information in a natural language text using a computer including a video monitor and a memory device for storing a data structure representing the natural language text,
wherein said data structure has a tree structure having nodes and levels, each node representing information from associated locations in said text, each of said locations having a corresponding location identifier, the information in a node being a categorical subset of information represented by a parent node of that node, each node being one level below its parent node in the tree structure and containing node constituents, said parent node and the nodes at higher levels of the tree structure that represent more general categories of information than the information represented by a node being ancestors of that node, the method comprising the steps of: -
formatting a query regarding information in the text into a query data structure; comparing the query data structure to a node to produce a comparison value representative of the degree of similarity between the query and the information represented by the node; enhancing the comparison value by an amount representative of the degree of similarity between the query and the information represented by ancestors of the node to produce an enhanced comparison value; and storing the location identifiers associated with the node if the enhanced comparison value exceeds a predetermined lower bound and the node contributes information responsive to the query. - View Dependent Claims (2)
-
-
3. A method for locating information in a natural language text using a computer including a video monitor and a memory device for storing a data structure representing said natural language text,
wherein said data structure has a tree structure having nodes and levels, each said node representing information from associated locations in said text, each said location having a corresponding location identifier, the information in a node being a categorical subset of information represented by a parent node of that node, each said node being one level below its parent node in said tree structure and containing node constituents, said method comprising the steps of: -
formatting a query regarding information in said text into a query data structure containing constituents of said query, each said query constituent representing a word or a phrase contained in said query; producing a match vector having a plurality of match elements, said match vector being representative of a degree of similarity between said query data structure and one of said nodes, and each said match element corresponding in position to a predetermined query constituent and to a predetermined parent element of a parent vector; merging said match vector with said parent vector to produce a merge vector having a plurality of merge elements, wherein each said merge element corresponds in position to a predetermined parent element by setting each said merge element to be equal to the greater of its corresponding parent element and the match element corresponding to the corresponding parent element; setting a contribution flag to true if any of said match elements is greater than zero and, at the same time, greater than or equal to said parent element corresponding to said match element; if said contribution flag is set to true; averaging said merge elements to produce a merge vector average; storing the location identifiers associated with said node if said merge vector average equals or exceeds a predetermined threshold value, whereby said stored location identifiers represent locations in said natural language text containing information responsive to said query; whether said contribution flag is true or not; performing, at each node in said data structure, said steps of producing a match vector, merging, setting a contribution flag, and, if said contribution flag is set to true, averaging and storing, wherein all said steps are performed at each said node after substituting for said parent vector said merge vector that resulted from performing said merging step on said parent node of said node; and displaying on said video monitor said stored location identifiers. - View Dependent Claims (4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
-
-
16. A system for locating information in a natural language text using a computer including a video monitor and a memory device for storing a data structure representing the natural language text,
wherein said data structure has a tree structure having nodes and levels, each node representing information from associated locations in said text, each of said locations having a corresponding location identifier, the information in each node being a categorical subset of information represented by a parent node of each node, each node being one level below its parent node in the tree structure and containing node constituents, said parent node and the nodes at higher levels of the tree structure that represent more general categories of information than the information represented by a node being ancestors of that node, the system comprising: -
query formatting means for formatting a query regarding information in the text into a query data structure; comparison means for comparing the query data structure to a node to produce a comparison value representative of the degree of similarity between the query and the information represented by the node; enhancing means for enhancing the comparison value by an amount representative of the degree of similarity between the query and the information represented by ancestors of the node to produce an enhanced comparison value; storing means for storing the location identifiers associated with the node if the enhanced comparison value exceeds a predetermined lower bound and the node contributes information responsive to the query; and display means for displaying the stored location identifiers on the video monitor.
-
-
17. A system for locating information in a natural language text using a computer including a video monitor and a memory device for storing a data structure representing said natural language text,
wherein said data structure has a tree structure having nodes and levels, each said node representing information from associated locations in said text, each of said locations having a corresponding location identifier, the information in a node being a categorical subset of information represented by a parent node of that node, each said node being one level below its parent node in said tree structure and containing node constituents, said system comprising: -
query formatting means for formatting a query regarding information in said text into a query data structure containing constituents of said query, each said query constituent representing a word or a phrase contained in said query; matching means for producing a match vector having a plurality of match elements, said match vector being representative of a degree of similarity between said query data structure and one of said nodes, and each said match element corresponding in position to a predetermined query constituent and to a predetermined parent element of a parent vector; merging means for merging said match vector with said parent vector to produce a merge vector having a plurality of merge elements, wherein each said merge element corresponds in position to a predetermined parent element, by setting each said merge element to be equal to the greater of its corresponding parent element and the match element corresponding to the corresponding parent element; contribution detecting means for setting a contribution flag to true if any of said match elements is greater than zero and, at the same time, greater than or equal to said parent element corresponding to said match element; contribution testing means for determining whether said contribution flag is set to true; merge averaging means responsive to said contribution testing means for averaging said merge elements to produce a merge vector average; threshold testing means for determining whether said merge vector average equals or exceeds a predetermined threshold value; storing means responsive to said threshold testing means for storing the location identifiers associated with said node if said merge vector average equals or exceeds said predetermined threshold value, whereby stored location identifiers represent locations in said natural language text containing information responsive to said query; substitution means for substituting said merge vector for said parent vector; and display means for displaying on said video monitor said stored location identifiers. - View Dependent Claims (18, 19, 20, 21, 22, 23, 24, 25, 26)
-
Specification