Semantic matching using predicate-argument structure
First Claim
1. A system that processes text intervals, comprising:
a memory; and
a processor configured to execute a plurality of modules stored in the memory;
the modules including:
a preprocessing module configured to:
extract a first proposition from a first text interval;
a generation module configured to:
provide a plurality of semantic roles, wherein each role of the plurality of roles defines a different semantic relationship between at least two words,
generate a first proposition tree from the first proposition, wherein the first proposition tree comprises at least one node connected to other nodes by at least one edge, wherein each node is respectively associated with at least one word from the first proposition,
assign at least one of the plurality of roles to the at least one edge; and
a matching module configured to:
determine a first similarity value between the first text interval and a second text interval based on a comparison of the first proposition tree and a second proposition tree corresponding to the second text interval, wherein the second text interval is different from the first text interval and at least one of the first text interval and the second text interval comprises natural language, and
selectively output the second text interval based on the first similarity value.
Abstract
The invention relates to topic classification systems in which text intervals are represented as proposition trees. Free-text queries and candidate responses are transformed into proposition trees, and a particular candidate response can be matched to a free-text query by transforming the proposition trees of the free-text query into the proposition trees of the candidate responses. Because proposition trees are able to capture semantic information of text intervals, the topic classification system accounts for the relative importance of topic words, for paraphrases and re-wordings, and for omissions and additions. Redundancy of two text intervals can also be identified.
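The matching idea in the abstract can be illustrated with a minimal sketch, which is not the patented implementation. It assumes a proposition tree is a word node whose children are keyed by semantic role, and it scores a candidate by the cost of transforming the query tree into the candidate tree. The names `Node`, `PARAPHRASES`, `word_cost`, and `transform_cost`, the role labels, and all cost values are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class Node:
    """One node of a proposition tree: a word plus role-labelled child edges."""
    word: str
    children: Dict[str, "Node"] = field(default_factory=dict)

# Hypothetical paraphrase table; a real system would need a far richer resource.
PARAPHRASES = {frozenset({"buy", "purchase"})}

def word_cost(a: str, b: str) -> float:
    """Cost of rewriting word a as word b (values are illustrative assumptions)."""
    if a == b:
        return 0.0
    if frozenset({a, b}) in PARAPHRASES:
        return 0.25  # cheap: treated as a re-wording
    return 1.0       # full substitution

def size(node: Node) -> int:
    """Number of nodes in the subtree rooted at node."""
    return 1 + sum(size(child) for child in node.children.values())

def transform_cost(query: Node, cand: Node) -> float:
    """Cost of turning the query tree into the candidate tree.

    Query subtrees with no same-role counterpart in the candidate count as
    omissions. Extra candidate material (additions) is not penalized here,
    which is a simplifying assumption.
    """
    cost = word_cost(query.word, cand.word)
    for role, q_child in query.children.items():
        c_child = cand.children.get(role)
        if c_child is None:
            cost += size(q_child)
        else:
            cost += transform_cost(q_child, c_child)
    return cost

query = Node("buy", {"subj": Node("company"), "obj": Node("startup")})
paraphrase = Node("purchase", {"subj": Node("company"), "obj": Node("startup"),
                               "time": Node("yesterday")})
unrelated = Node("sell", {"subj": Node("company")})

print(transform_cost(query, paraphrase))  # lower cost: paraphrase plus an addition
print(transform_cost(query, unrelated))   # higher cost: substitution plus an omission
```

A lower cost indicates a closer match. Because the cost is asymmetric, a candidate that adds detail to the query is treated more leniently than one that omits part of it.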
27 Claims
1. A system that processes text intervals, comprising:
a memory; and
a processor configured to execute a plurality of modules stored in the memory;
the modules including:
a preprocessing module configured to:
extract a first proposition from a first text interval;
a generation module configured to:
provide a plurality of semantic roles, wherein each role of the plurality of roles defines a different semantic relationship between at least two words,
generate a first proposition tree from the first proposition, wherein the first proposition tree comprises at least one node connected to other nodes by at least one edge, wherein each node is respectively associated with at least one word from the first proposition,
assign at least one of the plurality of roles to the at least one edge; and
a matching module configured to:
determine a first similarity value between the first text interval and a second text interval based on a comparison of the first proposition tree and a second proposition tree corresponding to the second text interval, wherein the second text interval is different from the first text interval and at least one of the first text interval and the second text interval comprises natural language, and
selectively output the second text interval based on the first similarity value.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
19. A method of processing text intervals, comprising:
extracting a first proposition from a first text interval;
providing a plurality of semantic roles, wherein each role of the plurality of roles defines a different semantic relationship between at least two words;
generating a first proposition tree from the first proposition, wherein the first proposition tree comprises at least one node connected to other nodes by at least one edge, wherein each node is respectively associated with at least one word from the first proposition;
assigning at least one of the plurality of roles to the at least one edge;
determining a first similarity value between the first text interval and a second text interval based on a comparison of the first proposition tree and a second proposition tree corresponding to the second text interval, wherein the second text interval is different from the first text interval and at least one of the first text interval and the second text interval comprises natural language; and
selectively outputting, using a processor, the second text interval based on the first similarity value.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27
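The sequence of steps recited in claim 19 can be sketched end to end as follows. This is an illustrative assumption-laden example, not the claimed method itself: propositions are taken as already extracted (a real system would need a parser), each tree is only one level deep, the similarity value is a simple overlap of role-labelled edges standing in for whatever comparison the specification describes, and "selective output" is modelled as a threshold. The names `Proposition`, `tree_edges`, `similarity`, `select`, and the threshold of 0.5 are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Assumed input format: (predicate word, {semantic role: argument word}).
Proposition = Tuple[str, Dict[str, str]]
# A role-labelled edge: (parent word, role, child word).
Edge = Tuple[str, str, str]

def tree_edges(prop: Proposition) -> Set[Edge]:
    """Build a one-level proposition tree, represented by its labelled edges."""
    predicate, args = prop
    return {(predicate, role, word) for role, word in args.items()}

def similarity(a: Proposition, b: Proposition) -> float:
    """Stand-in similarity value: shared edges divided by all distinct edges."""
    edges_a, edges_b = tree_edges(a), tree_edges(b)
    if not edges_a and not edges_b:
        # Two bare predicates: compare the root words directly.
        return 1.0 if a[0] == b[0] else 0.0
    return len(edges_a & edges_b) / len(edges_a | edges_b)

def select(query: Proposition,
           candidates: Dict[str, Proposition],
           threshold: float = 0.5) -> List[str]:
    """Selectively output candidate texts whose similarity meets the threshold."""
    return [text for text, prop in candidates.items()
            if similarity(query, prop) >= threshold]

query = ("acquire", {"subj": "google", "obj": "youtube"})
candidates = {
    "Google acquired YouTube in 2006.":
        ("acquire", {"subj": "google", "obj": "youtube", "time": "2006"}),
    "YouTube acquired Google.":
        ("acquire", {"subj": "youtube", "obj": "google"}),
}

print(select(query, candidates))
```

The second candidate uses the same words as the query but with the roles swapped, so it shares no labelled edges with the query and is not output. A plain bag-of-words comparison would not separate the two candidates this way.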
Specification