MINING TRANSLITERATIONS FOR OUT-OF-VOCABULARY QUERY TERMS
First Claim
1. A method for retrieving information, implemented by an information retrieval system, comprising:
receiving a query that includes an in-vocabulary component and an out-of-vocabulary (OOV) component, the in-vocabulary component comprising at least one term that is included in a translation dictionary, and the OOV component comprising at least one term that is not included in the translation dictionary;
identifying a body of information associated with the in-vocabulary component of the query, using the translation dictionary;
performing mining analysis to extract at least one viable transliteration associated with the OOV component of the query from the body of information;
updating the translation dictionary to include said at least one viable transliteration, to provide an updated translation dictionary; and
identifying another body of information associated with the in-vocabulary component of the query, using the updated translation dictionary.
Abstract
An approach is described for using a query expressed in a source language to retrieve information expressed in a target language. The approach uses a translation dictionary to convert terms in the query from the source language to appropriate terms in the target language. The approach determines viable transliterations for out-of-vocabulary (OOV) query terms by retrieving a body of information based on an in-vocabulary component of the query, and then mining the body of information to identify the viable transliterations for the OOV query terms. The approach then adds the viable transliterations to the translation dictionary. The retrieval, mining, and adding operations can be repeated one or more times.
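The retrieve, mine, and add cycle described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `mine_oov_transliterations`, the `retrieve` and `mine` callables, the `max_rounds` limit, and the representation of the dictionary as a mapping from source terms to sets of target terms are all hypothetical.

```python
from typing import Callable, Dict, List, Set


def mine_oov_transliterations(
    query_terms: List[str],
    dictionary: Dict[str, Set[str]],
    retrieve: Callable[[List[str]], List[str]],
    mine: Callable[[str, List[str]], Set[str]],
    max_rounds: int = 3,  # assumed stopping rule; the source only says "one or more times"
) -> Dict[str, Set[str]]:
    """Repeat retrieve -> mine -> add until nothing new is found."""
    for _ in range(max_rounds):
        # Target-language terms for every query term the dictionary covers.
        target_terms = sorted(
            t for term in query_terms for t in dictionary.get(term, set())
        )
        # Query terms the dictionary does not cover yet.
        oov_terms = [term for term in query_terms if term not in dictionary]
        if not oov_terms or not target_terms:
            break
        # Retrieve a body of information using only the in-vocabulary component.
        documents = retrieve(target_terms)
        added = False
        for term in oov_terms:
            found = mine(term, documents)
            if found:
                # Add the viable transliterations to the dictionary (in place).
                dictionary.setdefault(term, set()).update(found)
                added = True
        if not added:
            break
    return dictionary
```

Once an OOV term gains an entry, it counts as in-vocabulary in the next round, so a later retrieval with the updated dictionary can draw on a broader set of target terms.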
20 Claims
1. A method for retrieving information, implemented by an information retrieval system, comprising:
receiving a query that includes an in-vocabulary component and an out-of-vocabulary (OOV) component, the in-vocabulary component comprising at least one term that is included in a translation dictionary, and the OOV component comprising at least one term that is not included in the translation dictionary;
identifying a body of information associated with the in-vocabulary component of the query, using the translation dictionary;
performing mining analysis to extract at least one viable transliteration associated with the OOV component of the query from the body of information;
updating the translation dictionary to include said at least one viable transliteration, to provide an updated translation dictionary; and
identifying another body of information associated with the in-vocabulary component of the query, using the updated translation dictionary.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
11. An information retrieval system for retrieving information, comprising:
a data store providing a translation dictionary for correlating terms in a source language to corresponding terms in a target language; and
a transliteration processing module configured to convert queries in the source language to respective counterparts in the target language, the queries encompassing a type of query that includes an in-vocabulary component and an out-of-vocabulary (OOV) component, the in-vocabulary component comprising at least one term that is included in the translation dictionary, and the OOV component comprising at least one term that is not included in the translation dictionary, the transliteration processing module comprising:
an in-vocabulary determination module configured to determine at least one in-vocabulary translation associated with the in-vocabulary component of the query;
a mining module configured to identify, within a body of information, at least one viable transliteration associated with the OOV component of the query, the body of information being identified based on said at least one in-vocabulary translation provided by the in-vocabulary determination module; and
an updating module configured to add said at least one viable transliteration to the translation dictionary to provide an updated translation dictionary.
Dependent claims: 12, 13, 14, 15, 16.
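As an illustration of the in-vocabulary determination recited in claim 11, the following sketch partitions a query into its in-vocabulary and OOV components by dictionary lookup. The function name `split_query` and the dictionary layout (source term mapped to a set of target terms) are assumptions made for this example; the claim does not prescribe a data structure.

```python
from typing import Dict, List, Set, Tuple


def split_query(
    terms: List[str], dictionary: Dict[str, Set[str]]
) -> Tuple[Dict[str, Set[str]], List[str]]:
    """Return (in-vocabulary translations, OOV terms) for a tokenized query."""
    in_vocab: Dict[str, Set[str]] = {}
    oov: List[str] = []
    for term in terms:
        translations = dictionary.get(term)
        if translations:
            # Term is covered: keep a copy of its target-language translations.
            in_vocab[term] = set(translations)
        else:
            # Term is missing (or has no translations): treat it as OOV.
            oov.append(term)
    return in_vocab, oov
```

The in-vocabulary translations would drive retrieval, while the OOV list identifies the terms for which the mining module looks for transliterations.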
17. A computer-readable medium for storing computer-readable instructions, the computer-readable instructions providing a mining module when executed by one or more processing devices, the computer-readable instructions comprising:
logic configured to identify at least one viable transliteration for an out-of-vocabulary (OOV) term within a query by mining an identified body of information, wherein the identified body of information is retrieved based on an in-vocabulary component of the query, wherein the OOV term corresponds to a term of the query that is not presently included in a translation dictionary, and the in-vocabulary component corresponds to at least one term that is included in the translation dictionary.
Dependent claims: 18, 19, 20.
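The mining logic of claim 17 can be illustrated with a simple stand-in. The claim does not state how viability is judged, so the sketch below substitutes plain character-level similarity from Python's standard `difflib` module for whatever transliteration similarity model an actual system would use. It also assumes both the OOV term and the documents are in (or have been romanized to) the same script. The function name `mine_transliterations` and the `threshold` value are hypothetical.

```python
import difflib
import re
from typing import List, Set


def mine_transliterations(
    oov_term: str, documents: List[str], threshold: float = 0.7
) -> Set[str]:
    """Collect document tokens whose spelling is close to the OOV term."""
    viable: Set[str] = set()
    needle = oov_term.lower()
    for doc in documents:
        # Tokenize on word characters; a real system would use a proper tokenizer.
        for token in re.findall(r"\w+", doc.lower()):
            # Stand-in similarity: ratio of matching characters (0.0 to 1.0).
            score = difflib.SequenceMatcher(None, needle, token).ratio()
            if score >= threshold:
                viable.add(token)
    return viable
```

For example, `mine_transliterations("gorbachev", ["Mikhail Gorbachov resigned in 1991"])` returns `{"gorbachov"}`, because only that token clears the assumed threshold.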
Specification