Concept matching system
Abstract
A system for retrieving documents related to a concept from a text corpus includes a set of stored semantic classes which are combinable to express the concept, each class including a set of keywords, and each set of keywords including at least one keyword. Syntactic rules are applied to identified text portions which include one or more of the keywords. Each rule identifies a first and a second of the semantic classes and is satisfied when a keyword from the first class and a keyword from the second class are in any one of a plurality of syntactic relationships. A concept matching module identifies text portions within the text corpus which include one or more of the keywords, applies the syntactic rules to those text portions, and identifies the text portions which satisfy at least one of the rules. Documents which include at least one of the identified text portions are retrieved.
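The stored structures and the rule check described above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the class names `DRUG` and `SYMPTOM`, the keywords, the relationship labels `subj-obj` and `modifier`, and the helper names. The abstract does not say how syntactic relationships are detected, so the sketch assumes each text portion arrives already parsed into `(word_a, relationship, word_b)` triples.

```python
import re
from dataclasses import dataclass

# Hypothetical semantic classes and keyword sets (not taken from the patent).
SEMANTIC_CLASSES = {
    "DRUG": {"aspirin", "ibuprofen"},
    "SYMPTOM": {"headache", "fever"},
}

@dataclass(frozen=True)
class SyntacticRule:
    first_class: str          # first semantic class named by the rule
    second_class: str         # second semantic class named by the rule
    relationships: frozenset  # any one of these relationships satisfies the rule

# Hypothetical rule: a DRUG keyword related to a SYMPTOM keyword.
RULES = [SyntacticRule("DRUG", "SYMPTOM", frozenset({"subj-obj", "modifier"}))]

def class_of(word):
    """Return the semantic class whose keyword set contains word, or None."""
    for name, keywords in SEMANTIC_CLASSES.items():
        if word.lower() in keywords:
            return name
    return None

def has_keyword(text):
    """True if the text portion contains at least one stored keyword."""
    return any(class_of(w) for w in re.findall(r"\w+", text))

def rule_satisfied(rule, relations):
    """Check one rule against the (word_a, relationship, word_b) triples of a portion."""
    for a, rel, b in relations:
        if (rel in rule.relationships
                and class_of(a) == rule.first_class
                and class_of(b) == rule.second_class):
            return True
    return False

def matching_portions(portions):
    """Keep portions that contain a keyword and satisfy at least one rule."""
    hits = []
    for text, relations in portions:
        if not has_keyword(text):
            continue
        if any(rule_satisfied(rule, relations) for rule in RULES):
            hits.append(text)
    return hits
```

In this sketch a portion that merely mentions two keywords is not returned unless the keywords also stand in one of the rule's relationships, which is the distinction the abstract draws between keyword search and rule matching.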
101 Citations
22 Claims
1. A system for retrieving documents related to a concept from a text corpus comprising:
a memory which stores:
a set of semantic classes which are combinable to express the concept, a set of keywords for each of the semantic classes to be used in searching documents in the text corpus, each set of keywords including at least one keyword, and a plurality of syntactic rules to be applied to identified text portions which include one or more of the keywords, each of the syntactic rules identifying a first of the semantic classes and a second of the semantic classes, the rule being satisfied when a keyword from the first of the semantic classes and a keyword from the second of the semantic classes are in any one of a plurality of syntactic relationships; and
a concept matching module, which accesses the memory, for identifying text portions within the text corpus which include one or more of the keywords and for applying the syntactic rules to the text portions and identifying those text portions which satisfy at least one of the syntactic rules, whereby documents are retrieved which include at least one of the identified text portions.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
18. A computer implemented system for retrieving documents related to a concept from a text corpus comprising:
a component which labels selected keywords in the text corpus;
a component which associates the labeled keywords with a semantic class, each of the semantic classes including a plurality of the keywords;
a component which labels pairs of keywords of selected semantic classes which are in any one of a plurality of syntactic relationships; and
a component which identifies documents which include a labeled pair of keywords.
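As an illustration of the first two components of claim 18, the following Python sketch labels keyword occurrences in raw text and associates each with a semantic class. The keyword-to-class table, the class names and the function name `label_keywords` are hypothetical, and plain word matching is only one possible way to do the labeling; the claim does not prescribe a technique.

```python
import re

# Hypothetical keyword-to-class table; each class has a plurality of keywords.
KEYWORD_CLASSES = {
    "aspirin": "DRUG",
    "ibuprofen": "DRUG",
    "headache": "SYMPTOM",
    "fever": "SYMPTOM",
}

def label_keywords(text):
    """Return (start, end, keyword, semantic_class) for each keyword occurrence."""
    labels = []
    for match in re.finditer(r"\w+", text):
        word = match.group().lower()
        if word in KEYWORD_CLASSES:
            labels.append((match.start(), match.end(), word, KEYWORD_CLASSES[word]))
    return labels
```

Recording character offsets with each label is a design choice of this sketch; it lets a later pair-labeling step refer back to the exact positions in the text.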
19. A method for retrieving documents related to a concept from a text corpus comprising:
labeling keywords in the text corpus which belong to at least one of a plurality of predefined semantic classes;
labeling pairs of labeled keywords which are in any one of a plurality of syntactic relationships and which meet one of a plurality of predefined syntactic rules;
labeling documents which include at least one labeled pair; and
retrieving at least a portion of the documents which include at least one labeled pair.
Dependent claims: 20, 21, 22
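The steps of claim 19 can be sketched end to end in Python. This is a toy illustration under stated assumptions, not the claimed implementation: the keyword table, the rule table and the relationship labels are hypothetical, and each document is assumed to have been parsed already into `(word_a, relationship, word_b)` triples by some external parser.

```python
# Hypothetical predefined semantic classes, expressed as keyword -> class.
KEYWORD_CLASSES = {
    "aspirin": "DRUG",
    "ibuprofen": "DRUG",
    "headache": "SYMPTOM",
    "fever": "SYMPTOM",
}

# Hypothetical syntactic rules: (first class, second class) -> allowed relationships.
RULES = {("DRUG", "SYMPTOM"): {"subj-obj", "modifier"}}

def label_pairs(relations):
    """Label keyword pairs whose classes and relationship meet one of the rules."""
    pairs = []
    for a, rel, b in relations:
        class_a = KEYWORD_CLASSES.get(a.lower())  # keyword labeling step
        class_b = KEYWORD_CLASSES.get(b.lower())
        if class_a and class_b and rel in RULES.get((class_a, class_b), ()):
            pairs.append((a, rel, b))
    return pairs

def retrieve(corpus):
    """corpus: dict of doc_id -> list of triples. Return ids with a labeled pair."""
    labeled = {doc_id: label_pairs(rels) for doc_id, rels in corpus.items()}
    return [doc_id for doc_id, pairs in labeled.items() if pairs]
```

Because the rule table is keyed on an ordered pair of classes, a SYMPTOM keyword followed by a DRUG keyword does not match the hypothetical rule here; a fuller version would need to decide whether rules are direction-sensitive.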
Specification