Method for cross-linguistic document retrieval
Abstract
The present invention provides a method and apparatus for retrieving documents that are stored in a language other than the language that is used to formulate a search query. This invention decomposes the query into terms and then translates each of the terms into terms of the language of the database. Once the database language terms have been listed, a series of subqueries is formed by creating all the possible combinations of the listed terms. Each subquery is then scored on each of the documents in the target language database. Only those subqueries that return meaningful scores are relevant to the query. Thus, the semantic meaning of the query is determined against the database itself and those documents in the database language that are most relevant to that semantic meaning are returned.
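The steps in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the toy dictionary `TRANSLATIONS`, the sample documents `DOCS`, the helper names, the coverage-based score, and the `threshold` parameter are all hypothetical. In particular, "all the possible combinations" is read here as choosing one candidate translation per query term, and a "meaningful" score is assumed to mean that every term of the subquery appears in the document.

```python
from itertools import product

# Hypothetical toy dictionary (English -> German candidates); not from the patent.
TRANSLATIONS = {
    "bank": ["bank", "ufer"],        # financial institution vs. river bank
    "loan": ["kredit", "darlehen"],
}

# Hypothetical target-language database.
DOCS = {
    "d1": "die bank gibt einen kredit",
    "d2": "das ufer des flusses",
    "d3": "ein darlehen von der bank",
}


def parse_query(query):
    """Decompose the first-language query into terms (whitespace split)."""
    return query.lower().split()


def build_subqueries(terms, translations):
    """List every subquery formed by picking one candidate translation per term.

    Assumption: 'combinations' means the Cartesian product of the candidate lists.
    Terms missing from the dictionary are passed through unchanged.
    """
    candidates = [translations.get(term, [term]) for term in terms]
    return [tuple(choice) for choice in product(*candidates)]


def score(subquery, document):
    """Assumed relevance score: fraction of subquery terms found in the document."""
    words = set(document.lower().split())
    return sum(1 for term in subquery if term in words) / len(subquery)


def retrieve(query, documents, translations, threshold=1.0):
    """Score every subquery against every document and keep the best match per document.

    Only scores at or above `threshold` count as meaningful (an assumption).
    Returns (doc_id, (score, subquery)) pairs, best score first.
    """
    subqueries = build_subqueries(parse_query(query), translations)
    best = {}
    for doc_id, text in documents.items():
        for subquery in subqueries:
            s = score(subquery, text)
            if s >= threshold and s > best.get(doc_id, (0.0, None))[0]:
                best[doc_id] = (s, subquery)
    return sorted(best.items(), key=lambda item: (-item[1][0], item[0]))


if __name__ == "__main__":
    for doc_id, (s, subquery) in retrieve("bank loan", DOCS, TRANSLATIONS):
        print(doc_id, s, subquery)
```

In this sketch the subqueries containing "ufer" (river bank) never reach the threshold on any document, so the financial sense of "bank" is selected by the database contents rather than by the translator, which is the disambiguation effect the abstract describes.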
10 Claims
1. A method of retrieving documents from a database, comprising:
generating a query in a first language;
parsing said query into a plurality of terms;
translating said plurality of terms into a second language;
listing a plurality of permutations of said translated terms;
testing each of said permutations against each document of said database to generate a score indicating the relevance of each of said permutations to each of said documents; and
retrieving documents from said database based on said score.
Dependent claims: 2, 3, 4, 5
6. An apparatus for retrieving documents from a database, comprising:
a computer coupled to a storage unit and to a display unit, said storage unit stores a database in at least one file;
said computer generates a query in a first language;
said computer parses said query into a plurality of terms;
said computer translates said plurality of terms into a second language corresponding to at least one language of the documents stored in said database;
said computer generates a listing of a plurality of permutations of said translated terms;
said computer tests each of said permutations against each document of said database to generate a score indicating the relevance of each of said permutations to each of said documents; and
said computer retrieves documents from said database in said storage unit based on said score.
Dependent claims: 7, 8, 9, 10
Specification