System for identifying textual relationships
First Claim
1. A computer-implemented method for identifying textual statement relationships, the method comprising:
identifying a textual statement pair that includes a first textual statement and a second textual statement, the first textual statement comprising a first set of words and the second textual statement comprising a second set of words;
removing, by a pre-processing module, non-alpha numeric characters from the first textual statement and the second textual statement;
communicating, by the pre-processing module, the pre-processed first textual statement and second textual statement to a processor;
extracting, by the processor, a first parsed word group from the first textual statement and a second parsed word group from the second textual statement, wherein each parsed word group is a verb-object-preposition (VOP) triple including a verb, an object, and a preposition from each respective textual statement;
comparing, for the textual statement pair, the first parsed word group and the second parsed word group; and
calculating, through the use of the processor, a parsed word score for the textual statement pair, wherein the parsed word score is based on the comparison of the first parsed word group and the second parsed word group;
determining a match score for the textual statement pair based on the parsed word score wherein calculating the parsed word score for the textual statement pair comprises;
extracting, through the use of the processor, a parsed word group pair from the textual statement pair, wherein the parsed word group pair includes a plurality of term pairs, the plurality of term pairs including a verb pair comprising a verb from the VOP triple for the first word group and a verb from the VOP triple for the second word group, an object pair comprising an object from the VOP triple for the first word group and an object from the VOP triple for the second word group, and a preposition pair comprising a preposition from the VOP triple for the first word group and a preposition from the VOP triple for the second word group;
calculating a verb pair sub-score, an object pair sub-score, and a preposition pair sub-score, the calculation of each pair sub-score based on a string similarity, a semantic similarity, and a lexicon similarity between each verb, object, or preposition of the respective verb pair, object pair, or preposition pair; and
wherein the parsed word score is the product of at least one of the verb pair sub-score, the object pair sub-score, and the preposition pair sub-score;
generating, by the processor, a user interface configured to depict one or more first textual statements and one or more second textual statements along with one or more match indicators that visually indicate a match between one or more of the first textual statements and one or more of the second textual statements;
communicating, by a graphics processor in communication with the processor and a display the generated user interface to thereby cause the display to visually display the generated user interface.
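The scoring steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the VOP triples have already been extracted by some parser, uses `difflib.SequenceMatcher` as a stand-in string similarity, and uses two tiny invented tables (`SYNONYMS`, `LEXICON`) in place of real semantic and lexicon resources. The claim says only that each sub-score is "based on" the three similarities, so taking their maximum is an assumption made here.

```python
import re
from difflib import SequenceMatcher

# Hypothetical resources, invented for this sketch only.
SYNONYMS = {frozenset({"send", "transmit"})}
LEXICON = {"invoice": "billing", "bill": "billing"}


def preprocess(statement: str) -> str:
    """Remove characters that are neither alphanumeric nor whitespace."""
    return re.sub(r"[^A-Za-z0-9\s]", "", statement)


def string_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def semantic_similarity(a: str, b: str) -> float:
    """1.0 if the terms are identical or listed as synonyms, else 0.0."""
    a, b = a.lower(), b.lower()
    return 1.0 if a == b or frozenset({a, b}) in SYNONYMS else 0.0


def lexicon_similarity(a: str, b: str) -> float:
    """1.0 if both terms map to the same lexicon category, else 0.0."""
    cat_a, cat_b = LEXICON.get(a.lower()), LEXICON.get(b.lower())
    return 1.0 if cat_a is not None and cat_a == cat_b else 0.0


def pair_sub_score(a: str, b: str) -> float:
    """Assumption: combine the three similarities by taking the maximum."""
    return max(
        string_similarity(a, b),
        semantic_similarity(a, b),
        lexicon_similarity(a, b),
    )


def parsed_word_score(vop_a: tuple, vop_b: tuple) -> float:
    """Product of the verb, object, and preposition pair sub-scores."""
    score = 1.0
    for term_a, term_b in zip(vop_a, vop_b):
        score *= pair_sub_score(term_a, term_b)
    return score
```

For example, `parsed_word_score(("send", "invoice", "to"), ("transmit", "bill", "to"))` multiplies a verb sub-score from the synonym table, an object sub-score from the lexicon table, and a preposition sub-score from the exact string match. Because the score is a product, a single poorly matching term pair pulls the whole parsed word score down.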
Abstract
A computer-implemented method identifies textual statement relationships. Textual statement pairs including a first and second textual statement are identified, and parsed word group pairs are extracted from first and second textual statements. The parsed word groups are compared, and a parsed word score for each statement pair is calculated. Word vectors for the first and second textual statements are created and compared. A word vector score is calculated based on the comparison of the word vectors for the first and second textual statements. A match score is determined for the textual statement pair, with the match score being representative of at least one of the parsed word score and the word vector score.
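The word vector comparison mentioned in the abstract can be sketched as follows. The abstract does not say how the vectors are built or compared, so this sketch assumes simple word-count vectors compared by cosine similarity. It also assumes the match score is the larger of the two scores, which is only one reading of "representative of at least one of". All function names are hypothetical.

```python
import math
from collections import Counter


def word_vector(statement: str) -> Counter:
    """Assumption: a word vector is a bag-of-words count of lowercase tokens."""
    return Counter(statement.lower().split())


def word_vector_score(first: str, second: str) -> float:
    """Cosine similarity between the two word-count vectors."""
    va, vb = word_vector(first), word_vector(second)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def match_score(parsed_word_score: float, vector_score: float) -> float:
    """Assumption: the match score is the larger of the two scores."""
    return max(parsed_word_score, vector_score)
```

Taking the maximum lets a statement pair match on either signal: a strong VOP alignment or a strong overall word overlap.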
80 Citations
29 Claims
1. A computer-implemented method for identifying textual statement relationships, the method comprising:
identifying a textual statement pair that includes a first textual statement and a second textual statement, the first textual statement comprising a first set of words and the second textual statement comprising a second set of words;
removing, by a pre-processing module, non-alpha numeric characters from the first textual statement and the second textual statement;
communicating, by the pre-processing module, the pre-processed first textual statement and second textual statement to a processor;
extracting, by the processor, a first parsed word group from the first textual statement and a second parsed word group from the second textual statement, wherein each parsed word group is a verb-object-preposition (VOP) triple including a verb, an object, and a preposition from each respective textual statement;
comparing, for the textual statement pair, the first parsed word group and the second parsed word group; and
calculating, through the use of the processor, a parsed word score for the textual statement pair, wherein the parsed word score is based on the comparison of the first parsed word group and the second parsed word group;
determining a match score for the textual statement pair based on the parsed word score wherein calculating the parsed word score for the textual statement pair comprises;
extracting, through the use of the processor, a parsed word group pair from the textual statement pair, wherein the parsed word group pair includes a plurality of term pairs, the plurality of term pairs including a verb pair comprising a verb from the VOP triple for the first word group and a verb from the VOP triple for the second word group, an object pair comprising an object from the VOP triple for the first word group and an object from the VOP triple for the second word group, and a preposition pair comprising a preposition from the VOP triple for the first word group and a preposition from the VOP triple for the second word group;
calculating a verb pair sub-score, an object pair sub-score, and a preposition pair sub-score, the calculation of each pair sub-score based on a string similarity, a semantic similarity, and a lexicon similarity between each verb, object, or preposition of the respective verb pair, object pair, or preposition pair; and
wherein the parsed word score is the product of at least one of the verb pair sub-score, the object pair sub-score, and the preposition pair sub-score;
generating, by the processor, a user interface configured to depict one or more first textual statements and one or more second textual statements along with one or more match indicators that visually indicate a match between one or more of the first textual statements and one or more of the second textual statements;
communicating, by a graphics processor in communication with the processor and a display the generated user interface to thereby cause the display to visually display the generated user interface.
Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
14. A system for textual statement relationship identification, the system comprising:
a database configured to store a first set of textual statements and a second set of textual statements;
a text analysis system comprising a pre-processing module, a processor, non-transitory computer readable storage medium, and a graphics processor wherein;
the pre-processing module is configured to remove non-alpha numeric characters from one or more textual statements;
the processor is in communication with the pre-processing module and is configured to receive pre-processed textual statements;
the non-transitory computer readable storage medium has stored therein data instructions executable by the processor to cause the processor to perform acts of;
identifying a textual statement pair that includes a first textual statement and a second textual statement, the first textual statement comprising a first set of words and the second textual statement comprising a second set of words;
extracting a parsed word group pair from the textual statement pair, where the parsed word group pair includes a first parsed word group from the first textual statement and a second parsed word group from the second textual statement, wherein the parsed word group pair includes a plurality of term pairs, the plurality of term pairs including a verb pair comprising a verb from the first word group and a verb from the second word group, an object pair comprising an object from the first word group and an object from the second word group, and a preposition pair comprising a preposition from the first word group and a preposition from the second word group;
comparing, for the textual statement pair, the first parsed word group and the second parsed word group;
calculating a verb pair sub-score, an object pair sub-score, and a preposition pair sub-score based on comparison of the textual statement pair, the first parsed word group and the second parsed word group;
calculating, a parsed word score for the textual statement pair, wherein the parsed word score is based on comparison of the first parsed word group and the second parsed word group wherein the parsed word score is a product of at least one of the verb pair sub-score, the object pair sub-score, and the preposition pair sub-score;
determining a match score for the textual statement pair based on the parsed word score; and
generating a user interface configured to depict one or more first textual statements and one or more second textual statements along with one or more match indicators that visually indicate a match between one or more of the first textual statements and one or more of the second textual statements;
communicating the generated user interface to the graphics processor; and
the graphics processor is in communicating with a display to thereby cause the display to visually display the generated user interface.
Dependent Claims (15, 16, 17, 18, 19, 20, 21, 22, 23)
24. A processor configured to calculate a match score for a textual statement pair, the processor comprising:
pre-processing hardware logic configured to remove non-alpha numeric characters from a first textual statement and a second textual statement of the textual statement pair, and to communicate the pre-processed textual statement pair to term extractor circuitry;
term extractor circuitry configured to receive the pre-processed textual statement pair and to extract a parsed word group pair from the pre-processed textual statement pair, the parsed word group pair including a plurality of term pairs, wherein the term pairs include a verb pair comprising a verb from each textual statement from the textual statement pair, an object pair comprising an object from the each textual statement, and a preposition pair comprising a preposition from the each textual statement;
a parsed term matcher circuitry including;
a string matcher circuitry configured to calculate a string similarity score for each term pair;
a semantic matcher circuitry configured to calculate a semantic similarity score for each term pair;
a lexicon matcher circuitry configured to calculate a lexicon similarity score for each term pair; and
wherein the parsed term matcher circuitry is configured to calculate a sub-score for each verb pair, object pair, and preposition pair based on at least one of the corresponding string similarity score, the corresponding semantic similarity score, and the corresponding lexicon similarity score;
wherein the parsed term matcher circuitry is configured to calculate a parsed word score for the textual statement pair based on at least one of the verb pair sub-score, the object pair sub-score, and the preposition pair sub-score; and
user interface circuitry configured to generate a user interface that depicts one or more first textual statements and one or more second textual statements along with one or more match indicators that visually indicate a match between one or more of the first textual statements and one or more of the second textual statements, and a graphics processor in communication with the process configured to communicate the generated user interface to a display to thereby cause the display to visually display the generated user interface.
Dependent Claims (25, 26, 27, 28, 29)
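Claims 14 and 24 both recite matching statements from a first set against statements from a second set and depicting match indicators. The sketch below shows one way that pairing step could look. It is an illustration under stated assumptions, not the claimed system: `score_pair` is a hypothetical stand-in scorer based on `difflib.SequenceMatcher`, the `threshold` value is invented, and the plain-text `<==>` arrow stands in for a graphical match indicator.

```python
from difflib import SequenceMatcher
from itertools import product


def score_pair(first: str, second: str) -> float:
    """Hypothetical stand-in for the match score of one statement pair."""
    return SequenceMatcher(None, first.lower(), second.lower()).ratio()


def find_matches(first_set, second_set, threshold=0.6):
    """Score every cross-set pair and keep those at or above the threshold."""
    matches = []
    for (i, a), (j, b) in product(enumerate(first_set), enumerate(second_set)):
        score = score_pair(a, b)
        if score >= threshold:
            matches.append((i, j, score))
    return matches


def render_matches(first_set, second_set, matches) -> str:
    """Render each match as one text line with an arrow as the indicator."""
    return "\n".join(
        f"{first_set[i]}  <==>  {second_set[j]}  ({score:.2f})"
        for i, j, score in matches
    )
```

Scoring every cross-set pair is quadratic in the number of statements, which is acceptable for small sets such as the ones a reviewer would inspect on screen.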
Specification