Free format query processing in an information search and retrieval system
First Claim
1. A method for processing terms of an input query, said method comprising the steps of:
- receiving an input query comprising a plurality of terms;
storing a knowledge base comprising a plurality of categories, wherein a plurality of subsets of said categories are designated dimensional categories;
processing said terms of said input query to identify value terms that comprise content carrying capacity;
referencing said knowledge base to identify a dimensional category for each value term;
generating a processed input query comprising, as a logical connector between two value terms, an AND if two respective value terms are associated with two different dimensional categories, and generating an OR if two respective value terms are associated with the same dimensional category.
Abstract
A search and retrieval system pre-processes an input query to map a contextual semantic interpretation, expressed by the user of the input query, to a boolean logic interpretation for processing in the search and retrieval system. A knowledge base comprises a plurality of categories, such that subsets of the categories are designated to one of a plurality of groups. A lexicon stores a plurality of terms including definitional characteristics for the terms. To pre-process the query, the search and retrieval system receives an input query comprising a plurality of terms, and processes the terms by referencing the lexicon to identify value terms that comprise a content carrying capacity. The knowledge base is referenced to identify a group for each value term. A processed input query is generated by inserting an AND logical connector between two value terms if the two respective value terms are in different groups and by inserting an OR logical connector between two value terms if the two respective value terms are in the same group. The lexicon is also used to identify phrases as well as connective terms for conversion to a boolean operator.
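The connector rule described in the abstract can be illustrated with a minimal sketch. The dictionary `TERM_TO_GROUP`, the function names, and the example terms are hypothetical and are not taken from the patent. The sketch also assumes that a term absent from the knowledge base belongs to no group and is therefore joined with AND. It emits a flat string and does not address operator precedence or grouping, which the abstract does not specify.

```python
# Hypothetical knowledge base lookup: value term -> group (dimensional category).
TERM_TO_GROUP = {
    "france": "geography",
    "germany": "geography",
    "wine": "food_and_drink",
    "beer": "food_and_drink",
}


def connector(term_a, term_b, term_to_group=TERM_TO_GROUP):
    """Return OR when both terms fall in the same group, AND otherwise."""
    group_a = term_to_group.get(term_a)
    group_b = term_to_group.get(term_b)
    # Assumption: terms missing from the knowledge base never share a group.
    if group_a is not None and group_a == group_b:
        return "OR"
    return "AND"


def build_processed_query(value_terms, term_to_group=TERM_TO_GROUP):
    """Insert a logical connector between each pair of adjacent value terms."""
    if not value_terms:
        return ""
    parts = [value_terms[0]]
    for prev, curr in zip(value_terms, value_terms[1:]):
        parts.append(connector(prev, curr, term_to_group))
        parts.append(curr)
    return " ".join(parts)


print(build_processed_query(["france", "germany", "wine"]))
```

Here "france" and "germany" share a group and are joined with OR, while "germany" and "wine" fall in different groups and are joined with AND.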
139 Citations
17 Claims
-
1. A method for processing terms of an input query, said method comprising the steps of:
-
receiving an input query comprising a plurality of terms;
storing a knowledge base comprising a plurality of categories, wherein a plurality of subsets of said categories are designated dimensional categories;
processing said terms of said input query to identify value terms that comprise content carrying capacity;
referencing said knowledge base to identify a dimensional category for each value term;
generating a processed input query comprising, as a logical connector between two value terms, an AND if two respective value terms are associated with two different dimensional categories, and generating an OR if two respective value terms are associated with the same dimensional category.
-
-
2. The method as set forth in claim 1, further comprising the steps of:
-
storing a plurality of phrases;
referencing said phrases to identify a plurality of successive input query terms as one of said phrases stored; and
processing said phrase identified as a single value term.
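The phrase steps above can be sketched as follows. The claim does not say how successive terms are matched against the stored phrases, so the longest-match-first strategy here is an assumption, and `STORED_PHRASES` and `merge_phrases` are hypothetical names used only for illustration.

```python
# Hypothetical stored phrases, each held as a tuple of successive terms.
STORED_PHRASES = {
    ("new", "york"),
    ("new", "york", "city"),
    ("stock", "market"),
}


def merge_phrases(terms, phrases=STORED_PHRASES):
    """Collapse runs of successive terms that match a stored phrase.

    Assumption: when several stored phrases match at one position,
    the longest one wins.
    """
    max_len = max((len(p) for p in phrases), default=1)
    merged = []
    i = 0
    while i < len(terms):
        # Try the longest candidate first, down to two-term phrases.
        for n in range(min(max_len, len(terms) - i), 1, -1):
            candidate = tuple(terms[i:i + n])
            if candidate in phrases:
                merged.append(" ".join(candidate))  # one single value term
                i += n
                break
        else:
            merged.append(terms[i])  # no phrase starts here; keep the term
            i += 1
    return merged


print(merge_phrases(["new", "york", "city", "stock", "market", "crash"]))
```

Each merged phrase is then handled downstream as one value term rather than as its separate words.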
-
-
3. The method as set forth in claim 1, wherein the step of processing said terms of said input query to identify value terms comprises the steps of:
-
storing a lexicon comprising a plurality of terms that identifies a part of speech for a respective term;
accessing said lexicon to reference each query input term; and
selecting, as value terms, those terms that carry content.
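A minimal sketch of the lexicon lookup in claim 3 is given below. The lexicon entries, the set of parts of speech treated as content carrying, and the decision to keep terms missing from the lexicon are all assumptions made for illustration; the claim states only that the lexicon identifies a part of speech for each term.

```python
# Hypothetical lexicon: term -> part of speech.
LEXICON = {
    "the": "determiner",
    "of": "preposition",
    "history": "noun",
    "french": "adjective",
    "wine": "noun",
}

# Assumption: these parts of speech are the ones that carry content.
CONTENT_PARTS_OF_SPEECH = {"noun", "adjective", "verb"}


def select_value_terms(query_terms, lexicon=LEXICON,
                       content_pos=CONTENT_PARTS_OF_SPEECH):
    """Keep the query terms whose part of speech carries content."""
    value_terms = []
    for term in query_terms:
        pos = lexicon.get(term.lower())
        # Assumption: terms missing from the lexicon (for example names)
        # are kept as value terms.
        if pos is None or pos in content_pos:
            value_terms.append(term)
    return value_terms


print(select_value_terms(["the", "history", "of", "French", "wine"]))
```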
-
-
4. The method as set forth in claim 1, further comprising the steps of:
-
storing a lexicon comprising a plurality of terms, wherein said lexicon identifies of a set of terms, designated as prepositions, as AND preposition terms;
referencing said lexicon to identify an input query term as an AND preposition term; and
generating, in said processed input query, an AND logical boolean connector in lieu of said input query term if an input query term comprises an AND preposition term.
-
-
5. The method as set forth in claim 1, further comprising the steps of:
-
storing a lexicon comprising a plurality of terms, wherein said lexicon identifies of a set of terms, designated as conjunctions, as AND conjunction terms;
referencing said lexicon to identify an input query term as an AND conjunction term; and
generating, in said processed input query, an AND logical boolean connector in lieu of said input query term if an input query term comprises an AND conjunction term.
-
-
6. The method as set forth in claim 1, further comprising the steps of:
-
storing a lexicon comprising a plurality of terms, wherein said lexicon identifies of a set of terms, designated as conjunctions, as OR conjunction terms;
referencing said lexicon to identify an input query term as an OR conjunction term; and
generating, in said processed input query, an OR logical boolean connector in lieu of said input query term if an input query term comprises an OR conjunction term.
-
-
7. The method as set forth in claim 1, further comprising the steps of:
-
storing a lexicon comprising a plurality of terms, wherein said lexicon identifies of a set of terms, designated as conjunctions, as NOT conjunction terms;
referencing said lexicon to identify an input query term as a NOT conjunction term; and
generating, in said processed input query, a NOT logical boolean connector in lieu of said input query term if an input query term comprises a NOT conjunction term.
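Claims 4 through 7 share one pattern: a query term that the lexicon designates as a connective is replaced by a boolean connector. The sketch below shows that substitution under stated assumptions. The specific words listed in `CONNECTIVE_TERMS` are hypothetical, since the claims do not say which prepositions or conjunctions the lexicon designates.

```python
# Hypothetical lexicon designations: connective term -> boolean connector.
CONNECTIVE_TERMS = {
    "in": "AND",    # assumed AND preposition term
    "with": "AND",  # assumed AND preposition term
    "and": "AND",   # assumed AND conjunction term
    "or": "OR",     # assumed OR conjunction term
    "not": "NOT",   # assumed NOT conjunction term
}


def replace_connectives(query_terms, connectives=CONNECTIVE_TERMS):
    """Emit a boolean connector in lieu of each designated connective term."""
    return [connectives.get(term.lower(), term) for term in query_terms]


print(replace_connectives(["wine", "in", "france", "or", "germany"]))
```

Terms that the lexicon does not designate as connectives pass through unchanged.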
-
-
8. The method as set forth in claim 1, wherein the step of storing a knowledge base comprising a plurality categories designated into groups comprises the step of storing a knowledge base wherein a subset of said categories comprises a plurality of dimensional categories, such that each dimensional category represents discrete and independent concepts from other dimensional categories and each dimensional category represents one of said groups.
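One way to picture the knowledge base of claim 8 is sketched below. The claim does not state how categories are arranged, so the parent-pointer tree, the names `PARENT` and `DIMENSIONAL`, and the example categories are purely hypothetical. Under that assumption, the dimensional category of any category is found by walking toward the root until a category marked as dimensional is reached.

```python
# Hypothetical category tree: category -> parent category (None at a root).
PARENT = {
    "geography": None,
    "europe": "geography",
    "france": "europe",
    "food_and_drink": None,
    "beverages": "food_and_drink",
    "wine": "beverages",
}

# Hypothetical subset of categories designated as dimensional categories.
DIMENSIONAL = {"geography", "food_and_drink"}


def dimensional_category(category, parent=PARENT, dimensional=DIMENSIONAL):
    """Walk up the assumed tree to the nearest dimensional category."""
    seen = set()
    node = category
    while node is not None and node not in seen:
        if node in dimensional:
            return node
        seen.add(node)  # guards against an accidental cycle in the data
        node = parent.get(node)
    return None  # the category belongs to no dimensional category


print(dimensional_category("france"))
print(dimensional_category("wine"))
```

Because "france" and "wine" resolve to different dimensional categories in this example, the rule of claim 1 would join them with AND.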
-
9. A computer readable medium comprising a plurality of instructions which when executed causes the computer to perform the steps of:
-
receiving an input query comprising a plurality of terms;
storing a knowledge base comprising a plurality of categories, wherein a plurality of subsets of said categories are designated dimensional categories;
processing said terms of said input query to identify value terms that comprise content carrying capacity;
referencing said knowledge base to identify a dimensional category for each value term;
generating a processed input query comprising, as a logical connector between two value terms, an AND if two respective value terms are associated with two different dimensional categories, and generating an OR if two respective value terms are associated with the same dimensional category.
-
-
10. The computer readable medium as set forth in claim 9, further comprising the steps of:
-
storing a plurality of phrases;
referencing said phrases to identify a plurality of successive input query terms as one of said phrases stored; and
processing said phrase identified as a single value term.
-
-
11. The computer readable medium as set forth in claim 9, wherein the step of processing said terms of said input query to identify value terms comprises the steps of:
-
storing a lexicon comprising a plurality of terms that identifies a part of speech for a respective term;
accessing said lexicon to reference each query input term; and
selecting, as value terms, those terms that carry content.
-
-
12. The computer readable medium as set forth in claim 9, further comprising the steps of:
-
storing a lexicon comprising a plurality of terms, wherein said lexicon identifies of a set of terms, designated as prepositions, as AND preposition terms;
referencing said lexicon to identify an input query term as an AND preposition term; and
generating, in said processed input query, an AND logical boolean connector in lieu of said input query term if an input query term comprises an AND preposition term.
-
-
13. The computer readable medium as set forth in claim 9, further comprising the steps of:
-
storing a lexicon comprising a plurality of terms, wherein said lexicon identifies of a set of terms, designated as conjunctions, as AND conjunction terms;
referencing said lexicon to identify an input query term as an AND conjunction term; and
generating, in said processed input query, an AND logical boolean connector in lieu of said input query term if an input query term comprises an AND conjunction term.
-
-
14. The computer readable medium as set forth in claim 9, further comprising the steps of:
-
storing a lexicon comprising a plurality of terms, wherein said lexicon identifies of a set of terms, designated as conjunctions, as OR conjunction terms;
referencing said lexicon to identify an input query term as an OR conjunction term; and
generating, in said processed input query, an OR logical boolean connector in lieu of said input query term if an input query term comprises an OR conjunction term.
-
-
15. The computer readable medium as set forth in claim 9, further comprising the steps of:
-
storing a lexicon comprising a plurality of terms, wherein said lexicon identifies of a set of terms, designated as conjunctions, as NOT conjunction terms;
referencing said lexicon to identify an input query term as a NOT conjunction term; and
generating, in said processed input query, a NOT logical boolean connector in lieu of said input query term if an input query term comprises a NOT conjunction term.
-
-
16. The computer readable medium as set forth in claim 9, wherein the step of storing a knowledge base comprising a plurality categories designated into groups comprises the step of storing a knowledge base wherein a subset of said categories comprises a plurality of dimensional categories, such that each dimensional category represents discrete and independent concepts from other dimensional categories and each dimensional category represents one of said groups.
-
17. A computer system comprising:
-
a user input device for receiving an input query comprising a plurality of terms;
memory for storing a knowledge base comprising a plurality of categories, wherein a plurality of subsets of said categories are designated dimensional categories; and
processor unit coupled to said memory and said user input device for processing said terms of said input query to identify value terms that comprise content carrying capacity, for referencing said knowledge base to identify a dimensional category for each value term, and for generating a processed input query comprising, as a logical connector between two value terms, an AND if two respective value terms are associated with two different dimensional categories, and generating an OR if two respective value terms are associated with the same dimensional category.
-
Specification