Method and apparatus for cross-linguistic database retrieval
Abstract
The present invention provides a method and apparatus for retrieving documents that are stored in a language other than the language that is used to formulate a search query. This invention decomposes the query into terms and then translates each of the terms into terms of the language of the database. Once the database language terms have been listed, a series of subqueries is formed by creating all the possible combinations of the listed terms. Each subquery is then scored on each of the documents in the target language database. Only those subqueries that return meaningful scores are relevant to the query. Thus, the semantic meaning of the query is determined against the database itself and those documents in the database language that are most relevant to that semantic meaning are returned.
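The flow described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the translation table, the whitespace tokenizer, the choice of one candidate translation per query term when forming subqueries, and the term-overlap relevance measure are all hypothetical stand-ins, since the abstract does not specify them.

```python
from itertools import product

# Hypothetical toy translation table: query-language term -> candidate
# database-language terms. A real system would use a bilingual dictionary.
TRANSLATIONS = {
    "bank": ["banque", "rive"],
    "river": ["rivière", "fleuve"],
}

# Hypothetical target-language database: document id -> text.
DOCUMENTS = {
    "d1": "la rive de la rivière est calme",
    "d2": "la banque ouvre un compte",
}


def parse_query(query):
    """Split the query into lowercase terms (whitespace tokenization is assumed)."""
    return query.lower().split()


def translate_terms(terms, table):
    """Return the candidate translations of each term; unknown terms are skipped."""
    return [table[term] for term in terms if term in table]


def list_subqueries(candidates):
    """Form every subquery that picks one candidate translation per query term."""
    return [tuple(combo) for combo in product(*candidates)]


def score(subquery, document):
    """Assumed relevance measure: fraction of subquery terms found in the document."""
    if not subquery:
        return 0.0
    words = set(document.lower().split())
    return sum(term in words for term in subquery) / len(subquery)


def score_all(subqueries, documents):
    """Score each subquery against each document, keyed by (subquery, doc_id)."""
    return {
        (subquery, doc_id): score(subquery, text)
        for subquery in subqueries
        for doc_id, text in documents.items()
    }


subqueries = list_subqueries(translate_terms(parse_query("river bank"), TRANSLATIONS))
scores = score_all(subqueries, DOCUMENTS)
```

In this toy data the subquery ("rivière", "rive") matches document d1 on both terms, while subqueries built from the financial sense of "bank" match it only partially. That is the sense in which the database itself selects the intended meaning of the query.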
10 Claims
1. A computer readable media bearing a sequence of computer executable instructions for retrieving documents from a database, said sequence of instructions enabling a computer to perform the steps of:
generating a query in a first language;
parsing said query into a plurality of terms;
translating said plurality of terms into a second language;
listing a plurality of permutations of said translated terms;
testing each of said permutations against each document of said database to generate a score indicating the relevance of each of said permutations to each of said documents; and
retrieving documents from said database based on said score.
2. A computer readable media, as in claim 1, further comprising:
computing a score for each of said permutations of said translated terms against each document of said database having text in said second language wherein said score indicates a measure of relevance of each permutation to each document.
3. A computer readable media, as in claim 2, further comprising:
retrieving documents from said database based on said score.
4. A computer readable media, as in claim 2, further comprising:
sorting said scores of said permutations to identify the highest ranking permutation; and
retrieving a document associated with said identified highest ranking permutation.
5. A computer readable media, as in claim 4, further comprising:
retrieving a plurality of documents in an order corresponding to an order generated by said sorting of said permutations.
6. An apparatus for retrieving documents from a database, comprising:
a computer coupled to a storage unit and to a display unit, said storage unit stores a database in at least one file;
said computer generates a query in a first language;
said computer parses said query into a plurality of terms;
said computer translates said plurality of terms into a plurality of second languages, each of said plurality of second languages corresponding to at least one language of the documents stored in said database;
said computer generates a listing of a plurality of permutations of said translated terms;
said computer tests each of said permutations against each document of said database to generate a score indicating the relevance of each of said permutations to each of said documents; and
said computer retrieves documents from said database in said storage unit based on said score.
7. An apparatus for retrieving documents from a database, as in claim 6, wherein:
said computer computes a score for each of said permutations of said translated terms against each document of said database having text in said second language wherein said score indicates a measure of relevance of each permutation to each document.
8. An apparatus for retrieving documents from a database, as in claim 7, wherein:
said computer retrieves documents from said database based on said score.
9. An apparatus for retrieving documents from a database, as in claim 7, wherein:
said computer sorts said scores of said permutations to identify the highest ranking permutation; and
said computer retrieves a document associated with said identified highest ranking permutation.
10. An apparatus for retrieving documents from a database, as in claim 9, wherein:
said computer retrieves a plurality of documents in an order corresponding to an order generated by said sorting of said permutations.
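The sorting and ordered retrieval recited in claims 4, 5, 9 and 10 might look roughly like the following Python sketch. It assumes the scores are already available as a mapping from (permutation, document id) to a relevance value. The function name `rank_documents`, the `min_score` cutoff and the sample scores are hypothetical and are not taken from the patent.

```python
def rank_documents(scores, min_score=0.0):
    """Return (doc_id, permutation, score) tuples, best-scoring permutation first.

    `scores` maps (permutation, doc_id) -> relevance value. Each document is
    listed once, under the highest-scoring permutation that matched it.
    Pairs scoring at or below `min_score` (an assumed cutoff) are dropped.
    """
    # Sort the (permutation, document) pairs by descending score.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

    ordered, seen = [], set()
    for (permutation, doc_id), value in ranked:
        if value <= min_score:
            break  # every remaining pair scores no higher than this one
        if doc_id not in seen:
            seen.add(doc_id)
            ordered.append((doc_id, permutation, value))
    return ordered


# Hypothetical scores for two documents and three permutations.
example_scores = {
    (("rivière", "rive"), "d1"): 1.0,
    (("rivière", "banque"), "d1"): 0.5,
    (("rivière", "banque"), "d2"): 0.5,
    (("fleuve", "rive"), "d2"): 0.0,
}

results = rank_documents(example_scores)
```

The first entry of `results` corresponds to the highest ranking permutation and its document, as in claims 4 and 9. The full list gives documents in the order produced by the sort, as in claims 5 and 10.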
Specification