Randomized query generation and document relevance ranking for robust information retrieval from a database
Abstract
A method and apparatus for randomly selecting terms from an input string to form a plurality of search queries is described. Each of the plurality of search queries can be provided to a database to locate database entries in the database. Database entries returned from a database search using the plurality of search queries may be ordered to provide a relevance ranked list of the database entries.
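The random term selection described in the abstract can be illustrated with a minimal sketch. The function name `make_queries`, its parameters, and the whitespace tokenization are assumptions made for illustration; the abstract does not specify how terms are delimited or how many queries are formed.

```python
import random

def make_queries(input_text, terms_per_query, num_queries, seed=None):
    """Form several queries, each a random sample of terms from input_text."""
    rng = random.Random(seed)
    terms = input_text.split()  # assumption: terms are whitespace-delimited
    if terms_per_query > len(terms):
        raise ValueError("terms_per_query exceeds the number of input terms")
    # rng.sample draws without replacement, so no term position is reused
    # within a single query; different queries may still overlap.
    return [rng.sample(terms, terms_per_query) for _ in range(num_queries)]

queries = make_queries(
    "randomized query generation for robust retrieval", 3, 4, seed=0
)
```

Each element of `queries` is a list of three terms drawn from the input string; passing a fixed `seed` only makes the sketch repeatable.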
130 Citations
21 Claims
1. An information retrieval system for retrieving information from a database, the system comprising:
a query string generator for receiving an input text string and for providing a plurality of query strings, each of the plurality of query strings having a plurality of terms wherein each of the plurality of terms is randomly selected from the input text string;
a database search interface coupled to the query string generator, the database search interface for receiving each of the plurality of query strings from the query string generator, for providing the plurality of query strings to the database and for receiving search results from the database, the search results including a number of items; and
a ranking processor for receiving the search results from the database search interface and for providing a rank value for each of the items in the search results.
Dependent claims: 2, 3, 4, 5, 6, 7
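The three components recited in claim 1 could be wired together roughly as follows. This is a sketch under stated assumptions, not the patented implementation: the class names, the in-memory dictionary standing in for the database, the all-terms matching rule, and the use of a hit count as the rank value (borrowed from the method claims) are all illustrative choices.

```python
import random
from collections import Counter

class QueryStringGenerator:
    """Provides query strings whose terms are randomly selected from the input."""
    def __init__(self, terms_per_query=2, num_queries=5, seed=None):
        self.terms_per_query = terms_per_query
        self.num_queries = num_queries
        self.rng = random.Random(seed)

    def generate(self, input_text):
        terms = input_text.split()  # assumption: whitespace-delimited terms
        k = min(self.terms_per_query, len(terms))
        return [" ".join(self.rng.sample(terms, k))
                for _ in range(self.num_queries)]

class DatabaseSearchInterface:
    """Sends each query string to the database and collects the results."""
    def __init__(self, database):
        # assumption: the database is a dict mapping item id -> item text
        self.database = database

    def search(self, query_strings):
        results = []
        for query in query_strings:
            wanted = query.split()
            hits = [item_id for item_id, text in self.database.items()
                    if all(term in text.split() for term in wanted)]
            results.append((query, hits))
        return results

class RankingProcessor:
    """Assigns each returned item a rank value (here: number of queries hit)."""
    def rank(self, search_results):
        counts = Counter()
        for _query, hits in search_results:
            counts.update(hits)
        return dict(counts)

database = {
    "doc1": "random query generation",
    "doc2": "document relevance ranking",
    "doc3": "robust query ranking",
}
generator = QueryStringGenerator(terms_per_query=1, num_queries=6, seed=1)
interface = DatabaseSearchInterface(database)
ranks = RankingProcessor().rank(interface.search(generator.generate("query ranking")))
```

In this toy run every query is a single term, either `query` or `ranking`, so `doc3` (which contains both) is returned by all six queries, while `doc1` and `doc2` split the six hits between them.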
8. A method for searching a database having a plurality of documents stored therein, the method comprising the steps of:
(a) identifying an input text string having a first plurality of terms;
(b) randomly selecting a second plurality of terms from the first plurality of terms to form a first search string, the number of terms in the first search string being less than the number of terms in the input text string;
(c) randomly selecting a third plurality of terms from the input text string, to form a second search string, wherein the number of terms in the first search string is equal to the number of terms in the second search string and wherein at least one of the terms in the second search string is not found in the first search string;
(d) identifying each of the documents in the database which contain the first search string;
(e) identifying each of the documents in the database which contain the second search string;
(f) computing a rank value for each of the documents identified in steps (d) and (e) wherein each rank value for each respective document corresponds to the number of search strings with which the respective document was identified; and
(g) listing each of the records in a predetermined order, wherein the order is determined by the corresponding rank values.
Dependent claims: 9, 10
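Steps (a) through (c) of claim 8 place two constraints on the second search string: it has the same number of terms as the first, and at least one of its terms is absent from the first. One way to satisfy both is to resample until the second condition holds. The helper name `two_search_strings` and the resampling approach are assumptions for illustration; the claim does not say how the constraint is enforced. The sketch also requires the term count to be smaller than the number of distinct input terms, which is slightly stricter than the claim but guarantees the loop can finish.

```python
import random

def two_search_strings(input_text, k, seed=None):
    """Return two k-term search strings; the second has a term not in the first."""
    rng = random.Random(seed)
    terms = input_text.split()  # assumption: whitespace-delimited terms
    if not 0 < k < len(set(terms)):
        raise ValueError("k must be positive and below the distinct term count")
    first = rng.sample(terms, k)
    while True:
        second = rng.sample(terms, k)
        # Accept only if some term of the second string is absent from the first.
        if set(second) - set(first):
            return first, second

first, second = two_search_strings("a b c d e", 3, seed=7)
```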
11. A method for searching a database comprising the steps of:
(a) identifying a first text string having a first plurality of terms;
(b) randomly selecting a first predetermined number of terms from the first plurality of terms to form a first search string, the number of terms in the first search string being less than the number of terms in the first text string;
(c) randomly selecting a next predetermined number of terms from the first plurality of terms to form a next search string;
(d) repeating step (c) a predetermined number of times to provide a predetermined number of next search strings, wherein the first search string and each of the next search strings form a plurality of search strings;
(e) identifying each of a plurality of database entries which contain at least one of the plurality of search strings;
(f) computing a rank value for each of the database entries identified in step (e) wherein each rank value for each respective database entry corresponds to the number of search strings from the plurality of search strings with which the respective database entry was identified; and
(g) listing each of the database entries in a predetermined order, wherein the order is determined by the corresponding rank values.
Dependent claims: 12, 13, 14, 15, 16
17. A method for searching a database comprising the steps of:
(a) identifying a first text string having a first plurality of terms;
(b) randomly selecting a first predetermined number of terms from the first plurality of terms to form a first search string, the number of terms in the first search string being less than the number of terms in the first text string;
(c) identifying each of a plurality of database entries which contain each of the terms in the first search string;
(d) randomly selecting a next predetermined number of terms from the first plurality of terms to form a next search string;
(e) identifying each of a plurality of database entries which contain each of the terms in the next search string;
(f) repeating steps (d) and (e) a predetermined number of times to provide a predetermined number of next search strings, wherein the first search string and each of the next search strings form a plurality of search strings;
(g) computing a rank value for each of the database entries identified in steps (c), (e) and (f) wherein each rank value for each respective database entry corresponds to the number of search strings from the plurality of search strings with which the respective database entry was identified; and
(h) listing each of the database entries in a predetermined order, wherein the order is determined by the corresponding rank values.
Dependent claims: 18
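The select, search, repeat, and rank loop of claim 17 can be sketched as below. The function name `randomized_search`, the dictionary of entries, the whitespace tokenization, and the choice of descending order for the final listing are assumptions for illustration; the claim only requires a predetermined order determined by the rank values.

```python
import random
from collections import Counter

def randomized_search(input_text, entries, k, num_strings, seed=None):
    """Rank entries by how many random k-term search strings they satisfy."""
    rng = random.Random(seed)
    terms = input_text.split()  # assumption: whitespace-delimited terms
    if not 0 < k < len(terms):
        raise ValueError("k must be positive and below the input term count")
    # Pre-tokenize each entry once so every search string is checked cheaply.
    entry_words = {entry_id: set(text.split())
                   for entry_id, text in entries.items()}
    votes = Counter()
    for _ in range(num_strings):
        search_terms = rng.sample(terms, k)
        for entry_id, words in entry_words.items():
            # An entry is identified only if it contains each selected term.
            if all(term in words for term in search_terms):
                votes[entry_id] += 1
    # List identified entries with the highest rank value first.
    return votes.most_common()

entries = {
    "e1": "random selection of query terms",
    "e2": "relevance ranking of documents",
    "e3": "random query terms improve relevance ranking",
}
ranked = randomized_search(
    "random query terms relevance ranking", entries, k=2, num_strings=20, seed=3
)
```

Here `e3` contains every term of the input string, so it is identified by all 20 search strings, while `e1` and `e2` are identified only by the strings whose terms happen to fall entirely within their own text.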
19. A computer program product for use with an information retrieval system, the computer program product comprising:
a computer usable medium having computer readable program code means for identifying an input text string having a first plurality of terms;
a computer usable medium having computer readable program code means for randomly selecting a first predetermined number of terms from the first plurality of terms to form a first search string, the number of terms in the first search string being less than the number of terms in the input text string;
a computer usable medium having computer readable program code means for randomly selecting a second predetermined number of terms from the input text string, to form a second search string, wherein at least one of the terms in the second search string is not found in the first search string;
a computer usable medium having computer readable program code means for identifying each of the documents in the database which contain the first search string;
a computer usable medium having computer readable program code means for identifying each of the documents in the database which contain the second search string;
a computer usable medium having computer readable program code means for computing a rank value for each of the documents identified by said computer usable medium having computer readable program code means for identifying each of the documents in the database which contain the first search string and said computer usable medium having computer readable program code means for identifying each of the documents in the database which contain the second search string, wherein each rank value for each respective document corresponds to the number of search strings with which the respective document was identified; and
a computer usable medium having computer readable program code means for listing each of the records in a predetermined order, wherein the order is determined by the corresponding rank values.
Dependent claims: 20, 21
Specification