Query refinement method for searching documents
Abstract
A user views search results and subjectively determines if a document is desirable or undesirable. Only documents categorized by the user are analyzed for deriving a list of prospective keywords. The frequency of occurrence of each word of each document is derived. Keywords that occur only in desirable documents are good keywords. Keywords that occur only in undesirable documents are bad keywords. Keywords that occur in both types are dirty keywords. The best keywords are the good keywords with the highest frequency of occurrence. The worst keywords are the bad keywords with the highest frequency of occurrence. A new query phrase includes the highest ranked good keywords and performs filtering using the highest ranked bad keywords. Key phrases are derived to clean dirty keywords into good key phrases. A key phrase also is derived from a good keyword and replaces the good keyword to narrow a search.
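The keyword classification described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the function names `tokenize` and `classify_keywords`, the lowercase alphanumeric tokenization, and the alphabetical tie-break are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: words are lowercase alphanumeric runs; the abstract does
    # not specify how words are extracted.
    return re.findall(r"[a-z0-9]+", text.lower())


def classify_keywords(desirable_docs, undesirable_docs):
    """Split words into good, bad, and dirty keywords (hypothetical helper).

    Good keywords occur only in desirable documents, bad keywords only in
    undesirable documents, and dirty keywords occur in both. Good and bad
    keywords are ranked by frequency of occurrence, highest first.
    """
    good_counts = Counter(w for d in desirable_docs for w in tokenize(d))
    bad_counts = Counter(w for d in undesirable_docs for w in tokenize(d))

    good = {w: c for w, c in good_counts.items() if w not in bad_counts}
    bad = {w: c for w, c in bad_counts.items() if w not in good_counts}
    dirty = set(good_counts) & set(bad_counts)

    def rank(counts):
        # Ties are broken alphabetically so the output is deterministic.
        return sorted(counts, key=lambda w: (-counts[w], w))

    return rank(good), rank(bad), sorted(dirty)


good, bad, dirty = classify_keywords(
    ["jaguar car engine car"],
    ["jaguar cat habitat cat"],
)
print(good, bad, dirty)
```

In this example "jaguar" appears in both categories, so it is treated as a dirty keyword rather than being used directly in the refined query.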
43 Claims
1. A method for refining an initial query phrase to search for documents of interest to a user, comprising the steps of:
categorizing at least one document found in a search using the initial query phrase as of interest based upon feedback from the user;
categorizing at least one other document found in the search using the initial query phrase as not of interest based upon feedback from the user;
generating a list of keywords by analyzing only the categorized documents;
ranking as first keywords, the keywords in the list of keywords which occur in only the documents of interest;
ranking as second keywords, the keywords in the list of keywords which occur in only the documents not of interest;
forming a refined query phrase to search for documents which include one or more of a plurality of the highest ranked first keywords, and to filter out documents which include any one or more of a plurality of the highest ranked second keywords.
Dependent claims: 2-12.
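The final step of claim 1, forming a refined query that includes the highest ranked first keywords and filters on the highest ranked second keywords, might be sketched as follows. The boolean `OR` / `AND NOT` query syntax, the `top_n` cutoff, and the names `form_refined_query` and `matches_refined_query` are assumptions for illustration; the claim does not prescribe a query language.

```python
import re


def form_refined_query(first_keywords, second_keywords, top_n=2):
    """Build a boolean query string (hypothetical syntax).

    Both lists are assumed to be ranked already, highest first, and
    first_keywords is assumed to be non-empty.
    """
    include = first_keywords[:top_n]
    exclude = second_keywords[:top_n]
    query = "(" + " OR ".join(include) + ")"
    if exclude:
        query += " AND NOT (" + " OR ".join(exclude) + ")"
    return query


def matches_refined_query(document, first_keywords, second_keywords, top_n=2):
    """Return True if the document contains at least one of the top first
    keywords and none of the top second keywords."""
    words = set(re.findall(r"[a-z0-9]+", document.lower()))
    include = first_keywords[:top_n]
    exclude = second_keywords[:top_n]
    has_included = any(k in words for k in include)
    has_excluded = any(k in words for k in exclude)
    return has_included and not has_excluded


first = ["car", "engine", "turbo"]
second = ["cat", "habitat"]
print(form_refined_query(first, second))
print(matches_refined_query("A fast car", first, second))
```

The second helper applies the same include and filter logic directly to a document, which is one way to check the behavior without depending on any particular search engine.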
13. A method for refining an initial query phrase to search for web pages on the world wide web that are of interest to a user, comprising the steps of:
categorizing at least one web page found in a search using the initial query phrase as of interest based upon feedback from the user;
categorizing at least one other web page found in the search using the initial query phrase as not of interest based upon feedback from the user;
generating a list of keywords by analyzing only the categorized web pages;
ranking as first keywords, the keywords in the list of keywords which occur in only the web pages of interest;
ranking as second keywords, the keywords in the list of keywords which occur in only the web pages not of interest;
forming a refined query phrase to search for web pages which include one or more of a plurality of the highest ranked first keywords, and to filter out web pages which include any one or more of a plurality of the highest ranked second keywords.
Dependent claims: 14-24.
25. A system for refining an initial query phrase to search for documents of interest to a user, comprising:
a display device;
an input device;
means for accessing a document domain;
processing means for categorizing at least one document found in a search using the initial query phrase as of interest based upon feedback from the user;
processing means for categorizing at least one other document found in the search using the initial query phrase as not of interest based upon feedback from the user;
processing means for ranking as first keywords, words which occur in only the documents of interest;
processing means for ranking as second keywords, words which occur in only the documents not of interest;
processing means for forming a refined query phrase to search for documents which include one or more of a plurality of the highest ranked first keywords, and to filter out documents which include any one or more of a plurality of the highest ranked second keywords.
Dependent claims: 26-33.
34. A method for refining an initial query phrase to search for documents of interest to a user, comprising the steps of:
categorizing at least one document found in a search using the initial query phrase as of interest based upon feedback from the user;
generating a list of words occurring in the at least one document categorized as of interest;
ranking the words in said list of words as keywords;
for at least one of the ranked keywords, replacing said one of the ranked keywords with a plurality of key phrases, wherein each one of the plurality of key phrases includes said one ranked keyword, and wherein each one of the plurality of key phrases is obtained by finding an occurrence of said one ranked keyword and combining said one ranked keyword with either one of a preceding word or following word in the respective occurrence of said keyword in a document of interest;
forming a refined query phrase to search for documents which include one or more of a plurality of the keywords and key phrases.
Dependent claims: 35-38.
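The key phrase derivation in claim 34 can be sketched roughly as below. This is only an illustration: the name `key_phrases` and the tokenization are assumptions, and ordering the phrases by how often they occur is an added convenience that the claim does not require. The sketch collects both the preceding-word and the following-word combination for every occurrence, leaving the choice between them to the caller.

```python
import re
from collections import Counter


def key_phrases(keyword, docs_of_interest):
    """Derive two-word key phrases containing `keyword` (hypothetical helper).

    For each occurrence of the keyword in a document of interest, the keyword
    is combined with the preceding word and with the following word.
    """
    phrases = Counter()
    for doc in docs_of_interest:
        # Assumption: words are lowercase alphanumeric runs.
        words = re.findall(r"[a-z0-9]+", doc.lower())
        for i, word in enumerate(words):
            if word != keyword:
                continue
            if i > 0:
                phrases[(words[i - 1], word)] += 1
            if i + 1 < len(words):
                phrases[(word, words[i + 1])] += 1
    # Most frequent phrases first; ties broken alphabetically.
    ordered = sorted(phrases.items(), key=lambda kv: (-kv[1], kv[0]))
    return [" ".join(pair) for pair, _ in ordered]


print(key_phrases("jaguar", ["the jaguar car is fast", "a jaguar car dealer"]))
```

Replacing the single keyword "jaguar" with a phrase such as "jaguar car" narrows the search in the way the claim describes, since only documents containing the two words together would match.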
39. A system for refining an initial query phrase to search for documents of interest to a user, comprising:
a display device;
an input device;
means for accessing a document domain;
processing means for categorizing at least one document found in a search using the initial query phrase as of interest based upon feedback from the user;
processing means for ranking keywords which occur in the documents of interest;
for at least one of the ranked keywords, processing means for replacing said one of the ranked keywords with a plurality of key phrases, wherein each one of the plurality of key phrases includes said one ranked keyword, and wherein each one of the plurality of key phrases is obtained by finding an occurrence of said one ranked keyword and combining said one ranked keyword with either one of a preceding word or following word in the respective occurrence of said keyword in a document of interest; and
processing means for forming a refined query phrase to search for documents which include one or more of a plurality of the keywords and key phrases.
Dependent claims: 40-43.
Specification