Iterative technique for phrase query formation and an information retrieval system employing same
Abstract
An information retrieval system and method are provided in which an operator inputs one or more query words, which are used to determine a search key for searching through a corpus of documents. Each match between the search key and the corpus is returned as a phrase containing the word data matching the query word(s), a non-stop (content) word next adjacent to the matching word data, and all intervening stop-words between the matching word data and the next adjacent non-stop-word. The operator, after reviewing one or more of the returned phrases, can then use one or more of the next adjacent non-stop-words as new query words to reformulate the search key and perform a subsequent search through the document corpus. This process can be conducted iteratively until the appropriate documents of interest are located. The additional non-stop-words from each phrase are preferably aligned with each other (e.g., by columnation) to ease viewing of the "new" content words.
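The phrase described in the abstract (matching word, intervening stop-words, next adjacent non-stop-word) can be illustrated with a minimal Python sketch. It is not the patented implementation: the stop-word list, the regex tokenizer, and the names `tokenize` and `next_phrases` are all assumptions made for illustration, and the search key is taken to be a single lowercase word matched exactly.

```python
import re

# Hypothetical stop-word list; the source does not enumerate one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}

def tokenize(text):
    """Lowercase word tokens; a deliberately simple stand-in for 'word data'."""
    return re.findall(r"[a-z0-9]+", text.lower())

def next_phrases(tokens, query):
    """Yield (match, intervening stop-words, next non-stop-word) for each hit."""
    for i, tok in enumerate(tokens):
        if tok != query:
            continue
        j = i + 1
        # Skip over stop-words until the next content word (or end of text).
        while j < len(tokens) and tokens[j] in STOP_WORDS:
            j += 1
        if j < len(tokens):
            yield tok, tokens[i + 1:j], tokens[j]

doc = "Retrieval of documents relies on an index. Retrieval systems rank the documents."
phrases = list(next_phrases(tokenize(doc), "retrieval"))
# phrases == [("retrieval", ["of"], "documents"), ("retrieval", [], "systems")]
```

A match that has no following non-stop-word is simply skipped here; the source does not say how such a case is handled.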
44 Claims
1. A method of selectively searching an automated data base with data processing apparatus, said data base containing a corpus of documents comprising sequences of word data stored as stop-words and non-stop-words in a memory, said method comprising the steps of:
a) inputting at least one query word to said data processing apparatus;
b) using said data processing apparatus, determining a word data search key based upon said at least one query word;
c) using said data processing apparatus, searching said document corpus to identify all occurrences of a match between said search key and said document corpus word data;
d) using said data processing apparatus, sequentially displaying said matches occurring in said document corpus, each match being displayed as a phrase containing the word data matching said search key, a non-stop-word next adjacent to said matching word data, and all intervening stop-words, between said matching word data and said next adjacent non-stop-word, said data processing apparatus displaying the phrases for multiple matches simultaneously and so that the respective non-stop-words are aligned with each other in a common column; and
e) selecting one of said next adjacent non-stop-words as a new query word and successively repeating steps b)-d) using the selected new query word to locate documents of interest from said document corpus.
Dependent claims: 2-15.
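Steps b) through e) of claim 1, including the common-column alignment of step d), might be sketched as follows. This is an illustration under stated assumptions, not the claimed apparatus: the stop-word list is invented, the search key is the query word itself, the selected word replaces the query rather than extending it, and the operator's selection in step e) is simulated by a hypothetical `choose` callback.

```python
# Hypothetical stop-word list; the source does not enumerate one.
STOP_WORDS = {"a", "an", "and", "for", "of", "on", "the", "to"}

def find_phrases(tokens, key):
    """Steps b)-c): return (match, stop-words, next non-stop-word) per hit."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == key:
            j = i + 1
            while j < len(tokens) and tokens[j] in STOP_WORDS:
                j += 1
            if j < len(tokens):
                hits.append((tok, tokens[i + 1:j], tokens[j]))
    return hits

def format_aligned(phrases):
    """Step d): pad the left part so every next non-stop-word starts in one column."""
    lefts = [" ".join([match, *stops]) for match, stops, _ in phrases]
    width = max(map(len, lefts), default=0)
    return [f"{left:<{width}} {nxt}" for left, (_, _, nxt) in zip(lefts, phrases)]

def refine(tokens, query, choose, max_rounds=3):
    """Step e): repeat the search with the word the operator selects."""
    trail = []
    for _ in range(max_rounds):
        phrases = find_phrases(tokens, query)
        if not phrases:
            break
        trail.append((query, format_aligned(phrases)))
        query = choose(phrases)  # stands in for the operator's selection
    return trail

tokens = "the search of the index and the search key for index terms".split()
lines = format_aligned(find_phrases(tokens, "search"))
# Simulated operator: always pick the content word of the last phrase shown.
trail = refine(tokens, "search", choose=lambda phrases: phrases[-1][2])
```

With this input, `lines` holds two phrases whose content words `index` and `key` start in the same column, and the simulated session visits the queries `search`, `key`, and `index` in turn.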
16. A method of selectively searching an automated data base with data processing apparatus, said data base containing a corpus of documents comprising sequences of word data stored as stop-words and non-stop-words in a memory, said method comprising the steps of:
a) inputting at least one query word to said data processing apparatus;
b) using said data processing apparatus, determining a word data search key based upon said at least one query word;
c) using said data processing apparatus, searching said document corpus to identify all occurrences of a match between said search key and said document corpus word data; and
d) using said data processing apparatus, displaying each match as a phrase containing the word data matching said search key, a single non-stop-word next adjacent to said matching word data in a distinctive form different from the display of other word data in the displayed phrases, and all intervening stop-words between said matching word data and said single next adjacent non-stop-word.
Dependent claims: 17-25.
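The "distinctive form" of claim 16, step d) is not tied to any particular rendering in the claim text. As a hedged illustration, the sketch below marks the next adjacent non-stop-word either in upper case or with ANSI bold escape codes; both choices, and the names `render` and `ansi_bold`, are assumptions for illustration only.

```python
def ansi_bold(word):
    """Wrap a word in ANSI bold escape codes (one possible distinctive form)."""
    return f"\033[1m{word}\033[0m"

def render(match, stops, nxt, mark=str.upper):
    """Join a phrase, applying `mark` only to the next adjacent non-stop-word."""
    return " ".join([match, *stops, mark(nxt)])

plain_terminal = render("search", ["of", "the"], "index")
# plain_terminal == "search of the INDEX"
bold_terminal = render("search", ["of", "the"], "index", mark=ansi_bold)
```

Passing the marker as a function keeps the phrase-building logic independent of how a given display distinguishes the word.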
26. A document retrieval system in which is stored a corpus of documents comprising sequences of word data stored as stop-words and non-stop-words in a memory, an apparatus for selectively searching through the corpus of documents comprising:
means for receiving at least one query word input by an operator of the document retrieval system;
means for searching through said document corpus and identifying all occurrences of a match between said document corpus word data and a search key determined based upon said at least one query word; and
means for displaying each match as a phrase containing the word data matching said search key, a single non-stop-word next adjacent to said matching word data, and all intervening stop-words between said matching word data and said single non-stop-word, wherein said means for displaying simultaneously displays phrases for multiple matches so that the respective non-stop-words from each phrase are aligned with each other in a common column.
Dependent claims: 27-38.
39. A document retrieval system in which is stored a corpus of documents comprising sequences of word data stored as stop-words and non-stop-words in a memory, an apparatus for selectively searching through the corpus of documents comprising:
means for receiving at least one query word input by an operator of the document retrieval system;
means for searching through said document corpus and identifying all occurrences of a match between said document corpus word data and a search key determined based upon said at least one query word; and
means for sequentially displaying each match as a phrase containing the word data matching said search key, at least one non-stop-word next adjacent to said matching word data in a distinctive form different from the display of other word data in the displayed phrases, and all intervening stop-words between said matching word data and said at least one non-stop-word, each phrase being displayed as no more than a single line of text.
Dependent claims: 40-44.
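Claim 39 allows more than one following non-stop-word per phrase, provided each phrase fits on a single line. One possible reading, offered only as a sketch, is to extend the phrase word by word while it still fits a given line width. The `width` parameter, the stop-word list, and the function name `one_line_phrase` are assumptions; the sketch always keeps the first next non-stop-word and does not handle the case where even that word overflows the line.

```python
# Hypothetical stop-word list; the source does not enumerate one.
STOP_WORDS = {"a", "an", "and", "for", "of", "on", "the", "to"}

def one_line_phrase(tokens, i, width=30):
    """Phrase starting at tokens[i], extended while it fits in `width` characters."""
    words = [tokens[i]]
    seen_content = False  # becomes True once one non-stop-word is included
    for tok in tokens[i + 1:]:
        candidate = " ".join(words + [tok])
        if seen_content and len(candidate) > width:
            break
        words.append(tok)
        seen_content = seen_content or tok not in STOP_WORDS
    # Drop trailing stop-words so the line ends on a non-stop-word.
    while len(words) > 1 and words[-1] in STOP_WORDS:
        words.pop()
    return " ".join(words)

tokens = "search of the index and the search key for index terms".split()
short_line = one_line_phrase(tokens, 0, width=25)
long_line = one_line_phrase(tokens, 0, width=40)
```

Here the narrower width yields `search of the index`, while the wider one admits two further content words before the line is full.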
Specification