System and method for use of semantic understanding in storage, searching, and providing of data or other content information
Abstract
A system and method for using semantic understanding in storing and searching data and other information. A linearized tuple-based version of a conceptual graph can be created from a user input. A plurality of conceptual graphs, or portions thereof, can be compared to determine matches. An associative database can be created and/or searched using a hierarchy of conceptual graphs in tuple format, so that data storage and searching of the database are optimized. The associative database can be used to integrate data from multiple different sources, form part of an Internet or other search engine, or be used in other implementations. Also disclosed is a system and method for use of semantic understanding in searching and providing content. In accordance with an embodiment, the system comprises a Syntactic Parser (SP) or statistical word tokenizer for data retrieval and parsing; a Syntax To Semantics (STS) transformational algebra-based semantic rule set; and an Associative Database (ADB) of linearized tuple conceptual graphs (TCG), utilizing a conceptual graph formalism. Data can be represented within the ADB, enabling both fast data retrieval in the form of semantic objects and a broad-ranging taxonomy of content.
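As a rough illustration of the idea in the abstract, the sketch below stores conceptual graphs as sets of linearized tuples and indexes each tuple back to the graphs that contain it. The names `LinTuple`, `AssociativeDB`, `store`, and `lookup`, and the (relation, arguments) tuple layout, are assumptions made for this example; the patent text here does not define a concrete data layout.

```python
from collections import defaultdict
from typing import NamedTuple


class LinTuple(NamedTuple):
    """One linearized tuple: a relation name plus its concept-node arguments (assumed layout)."""
    relation: str
    args: tuple[str, ...]


class AssociativeDB:
    """Toy associative store: maps each tuple to the names of the TCGs that contain it."""

    def __init__(self) -> None:
        self.graphs: dict[str, frozenset[LinTuple]] = {}
        self.index: dict[LinTuple, set[str]] = defaultdict(set)

    def store(self, name: str, tuples: list[LinTuple]) -> None:
        """Store a named TCG and index every one of its tuples."""
        self.graphs[name] = frozenset(tuples)
        for t in tuples:
            self.index[t].add(name)

    def lookup(self, query: list[LinTuple]) -> dict[str, int]:
        """Return, for each stored TCG, how many distinct query tuples it shares."""
        hits: dict[str, int] = defaultdict(int)
        for t in set(query):
            for name in self.index.get(t, ()):
                hits[name] += 1
        return dict(hits)


adb = AssociativeDB()
adb.store("doc1", [LinTuple("agent", ("chase", "dog")),
                   LinTuple("patient", ("chase", "cat"))])
adb.store("doc2", [LinTuple("agent", ("sleep", "cat"))])

result = adb.lookup([LinTuple("agent", ("chase", "dog")),
                     LinTuple("patient", ("chase", "mouse"))])
print(result)  # {'doc1': 1}
```

Indexing by tuple rather than by document is one plausible reading of "associative": a query touches only the graphs that share at least one tuple with it.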
6 Citations
21 Claims
1. A computer-based method for finding text semantically related to a selected text by comparing one or more tuple conceptual graphs (TCG) for the selected text with one or more tuple conceptual graphs (TCGs) for a plurality of candidate similar texts, comprising the steps of:
- receiving, in a memory, an input text;
- parsing the input text, with a processor executing instructions stored in a memory, for syntactic linkages;
- transforming the parsed input text into one or more linearized conceptual tuple graphs (TCG), said transforming comprising executing an algebraic-based syntactic transformation with a processor in accordance with instructions stored in a memory to generate
  - for the input text, a first TCG comprising one or more of a first name or other TCG identifier and a first set of linearized tuples, and
  - for a plurality of candidate semantically related texts, at least one other TCG comprising one or more of a name or other TCG identifier and at least one other set of linearized tuples,
  wherein each of the first TCG and at least one other TCG comprise stored semantic relationships;
- storing each of said first TCG and at least one other TCG in a database;
- ordering, with a processor, the first and at least one other set of linearized tuples according to a sort criteria, and folding, with the processor, tuple relationships into a minimal canonical representation by successively examining and merging sorted tuple relationships and resolving arguments upon ties;
- comparing, with the processor, the first TCG and at least one other TCG to determine a match, and if a match is found then identifying the first TCG as an equal or partial match of at least one other TCG; and
- reporting one or more of the full or the most complete partial matches as similar text.
Dependent claims: 2, 3, 4, 5, 6
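The ordering, folding, comparing, and reporting steps of claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented algorithm: the sort criterion (relation name, then arguments), the folding rule (adjacent tuples that tie on relation and first argument have their remaining arguments merged), and the names `canonicalize`, `compare`, and `most_similar` are all choices made for this example.

```python
def canonicalize(tuples):
    """Sort tuples, then fold neighbours that tie on (relation, first argument)."""
    ordered = sorted(set(tuples))  # assumed sort criterion: relation, then arguments
    folded = []
    for relation, args in ordered:
        head, rest = args[0], args[1:]
        if folded and folded[-1][0] == relation and folded[-1][1][0] == head:
            # Tie: merge the remaining arguments, deduplicated and sorted.
            prev_rest = folded[-1][1][1:]
            merged = tuple(sorted(set(prev_rest) | set(rest)))
            folded[-1] = (relation, (head,) + merged)
        else:
            folded.append((relation, (head,) + tuple(rest)))
    return tuple(folded)


def compare(first, other):
    """Classify two TCGs as 'equal', 'partial', or 'none', with a coverage score."""
    a, b = set(canonicalize(first)), set(canonicalize(other))
    if a == b:
        return "equal", 1.0
    shared = a & b
    if shared:
        return "partial", len(shared) / len(a)
    return "none", 0.0


def most_similar(first, candidates):
    """Report matching candidates, most complete match first."""
    scored = [(name, *compare(first, tcg)) for name, tcg in candidates.items()]
    matches = [s for s in scored if s[1] != "none"]
    return sorted(matches, key=lambda s: s[2], reverse=True)


query = [("attr", ("dog", "brown")),
         ("agent", ("chase", "dog")),
         ("attr", ("dog", "big"))]
candidates = {
    "t1": [("attr", ("dog", "big", "brown")), ("agent", ("chase", "dog"))],
    "t2": [("agent", ("chase", "dog")), ("patient", ("chase", "cat"))],
    "t3": [("agent", ("sleep", "cat"))],
}
ranking = most_similar(query, candidates)
print(ranking)  # [('t1', 'equal', 1.0), ('t2', 'partial', 0.5)]
```

Because both graphs are reduced to the same canonical form before comparison, two texts that state the same attributes in a different order or split across several tuples still compare as equal.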
7. A method for a computer to answer a text based question using one or more sources of information, comprising:
- receiving at the computer from a user input the text based question;
- parsing into tokens the text based question;
- semantically interpreting the parsed, tokenized text, creating a plurality of linearized tuples corresponding to the text based question; and
- comparing the plurality of linearized tuples for the text based question to linearized tuples for semantically interpreted text within a database content according to a tuple conceptual graph (TCG) hierarchy, relation hierarchy, and node hierarchy, to find full or partial matches between the linearized tuples corresponding to the text based question and the database content;
- whereupon if only finding partial matches between the tuples for the text based question and the tuples for text within the database;
  - performing TCG joins between a plurality of partially matched tuples from the database based either on partial tuple overlap or over any concept node argument to tuple information which comes from different texts, to combine content from the database into a new TCG reflecting new semantic information which is not fully or directly present in any individual textual source or previously stored in the database; and
  - presenting a text representation of the new TCGs as the answer to the question.
Dependent claims: 8, 9, 10, 11
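The join step of claim 7 can be illustrated with a small sketch. It assumes tuples of the form (relation, arguments) and treats two TCGs as joinable when they share a whole tuple or any concept-node argument; the functions `concept_nodes`, `join_tcgs`, `answer`, and `render` are hypothetical names for this example, and the sample texts are invented.

```python
def concept_nodes(tcg):
    """Collect every concept-node argument that appears in a TCG."""
    return {arg for _, args in tcg for arg in args}


def join_tcgs(left, right):
    """Join two TCGs if they share a tuple or a concept node; otherwise return None."""
    left_set, right_set = set(left), set(right)
    shared_tuples = left_set & right_set
    shared_nodes = concept_nodes(left_set) & concept_nodes(right_set)
    if not shared_tuples and not shared_nodes:
        return None
    return tuple(sorted(left_set | right_set))


def answer(question, database):
    """Return a full match if one exists, else a join of the partial matches."""
    q = set(question)
    partial = []
    for name in sorted(database):
        tcg = set(database[name])
        if q <= tcg:
            return name, tuple(sorted(tcg))  # full match, no join needed
        if q & tcg:
            partial.append(name)
    joined, used = None, []
    for name in partial:
        if joined is None:
            joined, used = tuple(sorted(database[name])), [name]
            continue
        candidate = join_tcgs(joined, database[name])
        if candidate is not None:
            joined, used = candidate, used + [name]
    return ("+".join(used), joined) if joined else (None, None)


def render(tcg):
    """Produce a plain-text rendering of a TCG."""
    return "; ".join(f"{rel}({', '.join(args)})" for rel, args in tcg)


database = {
    "textA": [("agent", ("chase", "cat")), ("patient", ("chase", "mouse"))],
    "textB": [("agent", ("eat", "mouse")), ("patient", ("eat", "cheese"))],
}
question = [("patient", ("chase", "mouse")), ("patient", ("eat", "cheese"))]
source, new_tcg = answer(question, database)
print(source)           # textA+textB
print(render(new_tcg))
```

Neither stored text covers the question alone, but both mention the concept node `mouse`, so the join yields a new graph that combines information from the two sources.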
12. A system for use of semantic understanding of a real language request for information, comprising:
- a syntactic parser, executable on a processor, configured to tokenize words for data retrieval and parsing;
- a syntax to semantics transformational algebra-based semantic rule set stored in a memory;
- an associative database of linearized tuple conceptual graphs (TCG) stored in a memory, which represents textual information;
- a user interface configured to facilitate input requests for information from a user to the system, and the system to pose questions to a user and allow the user to answer questions presented by the system;
- wherein the system, with a processor, semantically interprets the real language request for information using at least one of the following;
  - link grammar, rules, and algebra-based transformations to transform the request to a semantic rendering or meaning, including creating a plurality of linearized tuples corresponding to the real language request; and
- comparing, with a processor, the linearized tuples corresponding to the real language request with the associated database of linearized tuple graphs content according to a TCG hierarchy, relation hierarchy, and node hierarchy, to identify a match with the processor according to the following rules stored in a memory;
  - if a match is found, present to the user, information corresponding to the match from the associated database;
  - if a partial match is found, using the interface, formulate and present a question for the user based on the partial match and request more specific information; and
  - receiving more specific information from the user, repeat the comparison of the tuples and formulate additional questions, if necessary, until a complete match is found; and subsequently present the information corresponding to the match.
Dependent claims: 13, 14
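The question-and-refinement loop of claim 12 might look like the sketch below. It is only an illustration under several assumptions: a stored TCG counts as a complete match once the request contains all of its tuples, the follow-up question is built from one missing tuple, and the user is simulated by a callback. The names `best_candidate`, `resolve`, `scripted_user`, and the sample database are hypothetical.

```python
def best_candidate(request, database):
    """Return (name, missing_tuples) for the stored TCG sharing the most tuples."""
    req = set(request)
    best = None
    for name in sorted(database):
        stored = set(database[name])
        shared = req & stored
        if shared and (best is None or len(shared) > best[2]):
            best = (name, stored - req, len(shared))
    return (best[0], best[1]) if best else (None, set())


def resolve(request, database, ask, max_rounds=5):
    """Ask follow-up questions until a stored TCG is completely matched."""
    request = list(request)
    for round_no in range(max_rounds + 1):
        name, missing = best_candidate(request, database)
        if name is None:
            return None          # nothing in the database overlaps the request
        if not missing:
            return name          # complete match
        if round_no == max_rounds:
            break                # give up after max_rounds questions
        relation, args = sorted(missing)[0]
        question = f"Can you specify the {relation} of {args[0]}?"
        request.extend(ask(question))  # the answer arrives as new tuples
    return None


database = {
    "flight_booking": [("action", ("book", "flight")),
                       ("dest", ("flight", "Paris")),
                       ("date", ("flight", "Monday"))],
    "hotel_booking": [("action", ("book", "hotel")),
                      ("dest", ("hotel", "Paris"))],
}

asked = []


def scripted_user(question):
    """Simulated user: answers the two questions this example can produce."""
    asked.append(question)
    if "date" in question:
        return [("date", ("flight", "Monday"))]
    if "dest" in question:
        return [("dest", ("flight", "Paris"))]
    return []


outcome = resolve([("action", ("book", "flight"))], database, scripted_user)
print(outcome)  # flight_booking
print(asked)
```

In a real system the user's free-text answer would itself be parsed and transformed into tuples; the callback here skips that step to keep the loop visible.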
15. A computer based system configured to answer a text based question using one or more sources of information comprising:
- a user interface configured for input of the text based question and presentation of answers to and from a processor and memory;
- a syntax to semantics transformational algebra-based semantic rule set stored in the memory;
- a database of linearized tuple conceptual graphs (TCG) stored in the memory, which represents textual information;
- a syntactic parser executable on the processor configured to;
  - parse text into tokens and semantically interpret the tokenized text, creating a plurality of linearized tuples corresponding to the text using the semantic rule set; and
  - compare linearized tuples for input text to linearized tuples for semantically interpreted text contained in the associative database according to a linearized tuple conceptual graph (TCG) hierarchy, relation hierarchy, and node hierarchy, to find full or partial matches between the tuples corresponding to the input text and the database content;
- whereupon the syntactic parser receiving the text based question from the user interface;
  - the question is parsed and tokenized; and
  - the parsed and tokenized text is compared to linearized tuples for semantically interpreted text contained in the database;
- whereupon only finding partial matches between the linearized tuples for the text based question and the linearized tuples for text within the database;
  - performing TCG joins between the partially matched linearized tuples from the database based either on partial tuple overlap or over any concept node argument to tuple information which comes from different texts, to combine content from the database into a new TCG reflecting new semantic information which is not fully or directly present in any individual textual source or previously stored in the database; and
  - presenting, in the user interface, a text representation of the new TCGs as the answer to the question.
Dependent claims: 16, 17, 18, 19, 20, 21
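Comparison "according to a relation hierarchy and node hierarchy" can be pictured with the sketch below. It assumes each hierarchy is a simple child-to-parent mapping and that a query tuple matches a stored tuple when the stored relation and arguments are equal to, or specializations of, the query's. The TCG hierarchy is not modeled. The functions `ancestors`, `subsumes`, `tuple_matches`, and `match_level`, and the sample hierarchies, are hypothetical.

```python
def ancestors(term, parent_of):
    """Yield a term followed by its ancestors in a child -> parent mapping."""
    seen = set()
    while term is not None and term not in seen:
        seen.add(term)
        yield term
        term = parent_of.get(term)


def subsumes(general, specific, parent_of):
    """True if `general` equals `specific` or is one of its ancestors."""
    return general in ancestors(specific, parent_of)


def tuple_matches(query, stored, relation_parent, node_parent):
    """Match a query tuple against a stored tuple using both hierarchies."""
    q_rel, q_args = query
    s_rel, s_args = stored
    return (len(q_args) == len(s_args)
            and subsumes(q_rel, s_rel, relation_parent)
            and all(subsumes(q, s, node_parent) for q, s in zip(q_args, s_args)))


def match_level(query_tcg, stored_tcg, relation_parent, node_parent):
    """Classify the comparison as 'full', 'partial', or 'none'."""
    matched = sum(
        any(tuple_matches(q, s, relation_parent, node_parent) for s in stored_tcg)
        for q in query_tcg
    )
    if matched and matched == len(query_tcg):
        return "full"
    return "partial" if matched else "none"


node_parent = {"dog": "animal", "cat": "animal", "animal": "entity"}
relation_parent = {"agent": "participant", "patient": "participant"}
stored = [("agent", ("chase", "dog")), ("patient", ("chase", "cat"))]

full = match_level([("agent", ("chase", "animal"))],
                   stored, relation_parent, node_parent)
partial = match_level([("participant", ("chase", "animal")),
                       ("agent", ("sleep", "animal"))],
                      stored, relation_parent, node_parent)
none = match_level([("agent", ("chase", "bird"))],
                   stored, relation_parent, node_parent)
print(full, partial, none)  # full partial none
```

With this reading, a question about an `animal` can be answered from text that mentions a `dog`, while a question about a `bird` cannot.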
Specification