Query translation through dictionary adaptation
Abstract
Cross-lingual information retrieval is disclosed, comprising: translating a received query from a source natural language into a target natural language; performing a first information retrieval operation on a corpus of documents in the target natural language using the translated query to retrieve a set of pseudo-feedback documents in the target natural language; re-translating the received query from the source natural language into the target natural language using a translation model derived from the set of pseudo-feedback documents in the target natural language; and performing a second information retrieval operation on the corpus of documents in the target natural language using the re-translated query to retrieve an updated set of documents in the target natural language.
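The four steps of the abstract can be sketched as a small pipeline. This is only an illustration under stated assumptions: the function names, the toy term-overlap ranker, and the dictionary format `{source_word: {target_word: probability}}` are all hypothetical, and `reweight_by_feedback` is a simple frequency-based stand-in for the adaptation step, not the translation-model estimation recited in the claims.

```python
from collections import Counter

def translate(query, dictionary):
    """Turn a source-language query into weighted target-language terms."""
    weights = Counter()
    for ws in query:
        for wt, p in dictionary.get(ws, {}).items():
            weights[wt] += p / len(query)
    return dict(weights)

def retrieve(weighted_query, corpus, k):
    """Toy monolingual retrieval: rank documents by weighted term overlap."""
    def score(doc):
        counts = Counter(doc.split())
        return sum(w * counts[t] for t, w in weighted_query.items())
    ranked = sorted(range(len(corpus)), key=lambda i: score(corpus[i]), reverse=True)
    return ranked[:k]

def reweight_by_feedback(query, dictionary, feedback_docs):
    """Stand-in adaptation (assumption): favor translations seen in feedback docs."""
    counts = Counter(w for d in feedback_docs for w in d.split())
    adapted = {}
    for ws in query:
        scores = {wt: p * (1 + counts[wt]) for wt, p in dictionary.get(ws, {}).items()}
        total = sum(scores.values())
        adapted[ws] = {wt: s / total for wt, s in scores.items()} if total else {}
    return adapted

def cross_lingual_search(query, dictionary, corpus, k=2):
    first_query = translate(query, dictionary)               # step 1: translate
    feedback = [corpus[i] for i in retrieve(first_query, corpus, k)]  # step 2
    adapted = reweight_by_feedback(query, dictionary, feedback)       # step 3
    return retrieve(translate(query, adapted), corpus, k)    # step 4: retrieve again

dictionary = {
    "banque": {"bank": 0.5, "shore": 0.5},
    "pret": {"loan": 0.7, "ready": 0.3},
}
corpus = [
    "bank loan interest rate bank",
    "river shore walk",
    "loan bank credit",
    "ready meal recipe",
]
results = cross_lingual_search(["banque", "pret"], dictionary, corpus)
```

In this toy example the feedback documents are about finance, so the adapted dictionary shifts weight from "shore" to "bank" before the second retrieval.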
9 Claims
1. A method of performing cross-lingual information retrieval, the method comprising:
translating a query qs={ws1, . . . , wsl} in a source natural language represented as a starting query language model P(ws|qs) in the source natural language into a target natural language different from the source natural language to generate a starting query in the target natural language, wherein P(ws|qs)=0 for words ws that are not in the query qs={ws1, . . . , wsl};
performing a first information retrieval operation on a corpus of documents in the target natural language by inputting the starting query in the target natural language to a monolingual information retrieval system configured to operate in the target natural language in order to retrieve a set of pseudo-feedback documents in the target natural language;
generating an updated query in the target natural language represented as an updated query language model p(wt|qs) in the target natural language computed as;
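The starting query language model P(ws|qs) in claim 1 is constrained only to be zero for words outside the query. A minimal sketch, assuming a maximum-likelihood (relative frequency) estimate, which the claim text does not specify; the function name `query_language_model` is hypothetical:

```python
from collections import Counter

def query_language_model(query_words):
    """Return P(ws|qs) as a function; words not in the query get probability 0."""
    counts = Counter(query_words)
    total = sum(counts.values())
    # Assumption: relative-frequency estimate over the query's own words.
    model = {w: c / total for w, c in counts.items()}
    return lambda w: model.get(w, 0.0)

p = query_language_model(["banque", "pret", "banque"])
in_query = p("banque")       # relative frequency of a query word
out_of_query = p("riviere")  # 0.0 for a word not in the query
```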
4. A cross-lingual information retrieval system comprising:
a monolingual information retrieval system configured to retrieve documents from a corpus of documents in a target natural language based on a received query in the target natural language; and
a processor configured to add cross-lingual information retrieval capability to the monolingual information retrieval system, the processor performing a process including:
representing a starting query in a source natural language by a starting query language model in the source natural language having zero values for words not in the starting query,
translating the starting query in the source natural language into the target natural language and inputting the translated starting query to the monolingual information retrieval system to retrieve a set of pseudo-feedback documents in the target natural language,
generating a source language-to-target language translation model based on the set of pseudo-feedback documents in the target natural language and the starting query language model, and
re-translating the starting query from the source natural language into the target natural language using the translation model and inputting the re-translated starting query to the monolingual information retrieval system to retrieve an updated set of documents in the target natural language;
wherein the translation model is generated by maximizing a likelihood of the set of pseudo-feedback documents in the target natural language given the starting query language model in the source natural language using an expectation maximization (EM) algorithm with the translated starting query as the starting translation model.
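The EM step in the wherein clause can be sketched as follows. This is an illustrative reading, not the patent's exact formulation: it assumes each target word in the feedback documents is generated as the sum over source words of P(wt|ws) P(ws|qs), and it omits any smoothing, so translations that never occur in the feedback documents drop to zero. The function name and data layout are hypothetical.

```python
from collections import Counter

def adapt_translation_model(query_lm, start_model, feedback_docs, iterations=10):
    """EM sketch: re-estimate P(wt|ws) to fit the pseudo-feedback documents.

    query_lm:    {ws: P(ws|qs)}, zero (absent) for words outside the query
    start_model: {ws: {wt: P(wt|ws)}}, the starting translation model
    """
    counts = Counter(w for doc in feedback_docs for w in doc.split())
    model = {ws: dict(cands) for ws, cands in start_model.items()}
    for _ in range(iterations):
        expected = {ws: Counter() for ws in model}
        for wt, n in counts.items():
            # E-step: posterior over which source word produced wt.
            joint = {ws: model[ws].get(wt, 0.0) * query_lm.get(ws, 0.0) for ws in model}
            z = sum(joint.values())
            if z == 0.0:
                continue  # wt cannot be explained by any query word
            for ws, j in joint.items():
                if j > 0.0:
                    expected[ws][wt] += n * j / z
        # M-step: renormalize expected counts into translation probabilities.
        for ws in model:
            total = sum(expected[ws].values())
            if total > 0.0:
                model[ws] = {wt: c / total for wt, c in expected[ws].items()}
    return model

query_lm = {"banque": 0.5, "pret": 0.5}
start_model = {
    "banque": {"bank": 0.5, "shore": 0.5},
    "pret": {"loan": 0.7, "ready": 0.3},
}
feedback_docs = ["bank loan interest bank", "loan bank credit shore"]
adapted = adapt_translation_model(query_lm, start_model, feedback_docs)
```

Because EM never revives a zero entry, the adapted model only redistributes mass among the translations the starting dictionary already allowed; a practical system would likely interpolate with the starting model or a background model.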
8. A non-transitory storage medium storing instructions executable to perform a cross-lingual information retrieval method comprising:
translating a query qs={ws1, . . . , wsl} in a source natural language represented as a starting query language model P(ws|qs) in the source natural language into a target natural language different from the source natural language to generate a starting query in the target natural language;
performing a first information retrieval operation on a corpus of documents in the target natural language by inputting the starting query in the target natural language to a monolingual information retrieval system configured to operate in the target natural language in order to retrieve a set of pseudo-feedback documents in the target natural language;
generating an updated query in the target natural language represented as an updated query language model p(wt|qs) in the target natural language computed as;
Specification