EVALUATING QUERY TRANSLATIONS FOR CROSS-LANGUAGE QUERY SUGGESTION
First Claim
1. A computer-implemented method, comprising:
receiving a query written in a first language, the query being a primary-language query suggestion generated based on a user input submitted to a search engine;
obtaining one or more unique candidate segmentations of the query in the first language, each unique candidate segmentation consisting of a respective sequence of segments resulted from segmenting the query in the first language;
for each of the one or more unique candidate segmentations, determining a respective set of one or more candidate translations in a second language by translating the respective sequence of segments of the candidate segmentation;
for each candidate translation of each of the one or more unique candidate segmentations:
determining a respective segmentation quality for the unique candidate segmentation based at least in part on how many stop words have been removed from the respective sequence of segments of the unique candidate segmentation and a respective first frequency of occurrence of the unique candidate segmentation in a first query log as a complete query written in the first language; and
determining a respective score for the candidate translation based at least on the respective segmentation quality determined for the unique candidate segmentation and a respective second frequency of occurrence of the candidate translation in a second query log as a complete query written in the second language; and
providing at least one of the candidate translations as a cross-language query suggestion for the query based on respective scores of the candidate translations.
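The two scoring steps recited above can be illustrated with a minimal sketch. The claim names the inputs to each quantity but not how they are combined, so the formulas below (a log-damped frequency divided by a stop-word penalty, then a product) are assumptions made for illustration only. The stop-word list, the function names, and the use of `Counter` objects as query logs are likewise hypothetical.

```python
import math
from collections import Counter

# Hypothetical stop-word list; the claim does not enumerate one.
STOP_WORDS = {"the", "of", "in", "a", "for"}


def count_removed_stop_words(query_tokens, segments):
    """Count stop words present in the query but absent from the segments."""
    before = Counter(t for t in query_tokens if t in STOP_WORDS)
    after = Counter(
        t for seg in segments for t in seg.split() if t in STOP_WORDS
    )
    return sum((before - after).values())


def segmentation_quality(query_tokens, segments, source_log):
    """Assumed heuristic: reward segmentations that occur often as complete
    queries in the source-language log, and penalize removed stop words."""
    removed = count_removed_stop_words(query_tokens, segments)
    freq = source_log[" ".join(segments)]  # Counter returns 0 if absent
    return math.log1p(freq) / (1 + removed)


def translation_score(quality, translation, target_log):
    """Assumed combination: segmentation quality times the log-damped
    frequency of the translation as a complete target-language query."""
    return quality * math.log1p(target_log[translation])
```

For example, with a source log in which "hotels in paris" occurs 120 times and "hotels paris" occurs 40 times, the segmentation that keeps the stop word scores higher under this heuristic than the one that drops it. Whether removed stop words should lower or raise quality is not stated in the claim; the penalty used here is one possible reading.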
Abstract
Computer-implemented methods, systems, and computer program products for generating cross-language query suggestions are described. For each query suggestion written in a first natural language, candidate segmentations are generated from the query suggestion, and candidate translations are generated from each candidate segmentation. The candidate translations are evaluated based on a measure of segmentation quality associated with the candidate segmentation from which each candidate translation is derived, and on a frequency of occurrence of the candidate translation in a target-language query log. The measure of segmentation quality associated with each candidate segmentation is further based on a frequency of occurrence of the candidate segmentation in a source-language query log. A candidate translation is provided as a cross-language query suggestion for the primary-language query suggestion based on the result of the evaluation.
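The candidate-generation flow summarized in the abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the segmenter simply enumerates every contiguous split of the query tokens, the translation step is a lookup in a hypothetical phrase dictionary, and the ranking uses only the target-language query log frequency, leaving out the segmentation-quality factor for brevity. All function names, the `joiner` parameter, and the sample data are hypothetical.

```python
from itertools import product


def candidate_segmentations(tokens):
    """Yield every way to split the tokens into contiguous segments."""
    n = len(tokens)
    if n == 0:
        return
    # Each bit of the mask decides whether to cut before token i.
    for mask in range(1 << (n - 1)):
        segments, start = [], 0
        for i in range(1, n):
            if mask & (1 << (i - 1)):
                segments.append(" ".join(tokens[start:i]))
                start = i
        segments.append(" ".join(tokens[start:]))
        yield tuple(segments)


def candidate_translations(segments, dictionary, joiner=""):
    """Combine per-segment translations; empty if any segment is unknown."""
    options = [dictionary.get(seg, []) for seg in segments]
    if not all(options):
        return []
    return [joiner.join(parts) for parts in product(*options)]


def suggest(query, dictionary, target_log, top_k=1):
    """Rank candidate translations by target-log frequency (assumed ranking)."""
    scored = {}
    for segments in candidate_segmentations(query.split()):
        for translation in candidate_translations(segments, dictionary):
            scored[translation] = target_log.get(translation, 0)
    return sorted(scored, key=scored.get, reverse=True)[:top_k]


# Hypothetical phrase dictionary and target-language query log.
dictionary = {
    "new york": ["纽约"],
    "hotels": ["酒店", "旅馆"],
    "new": ["新"],
    "york": ["约克"],
}
target_log = {"纽约酒店": 500, "纽约旅馆": 20}
best = suggest("new york hotels", dictionary, target_log)
```

In this sample, the segmentation ("new york", "hotels") yields translations that actually occur in the target log, while the word-by-word segmentation yields translations that do not, which is the kind of distinction the log-frequency evaluation is meant to surface.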
11 Claims
2. A computer-implemented method, comprising:
receiving a query written in a first language;
obtaining one or more unique candidate segmentations of the query in the first language, each unique candidate segmentation consisting of a respective sequence of segments resulted from segmenting the query in the first language;
for each of the one or more unique candidate segmentations:
determining a respective measure of segmentation quality for the unique candidate segmentation; and
obtaining a respective set of one or more candidate translations in a second language by translating the respective sequence of segments of the candidate segmentation;
for each candidate translation of each of the one or more unique candidate segmentations:
determining a first frequency of occurrence of the candidate translation in a first query log as a complete query written in the second language; and
determining a respective score for the candidate translation based at least on the first frequency of occurrence of the candidate translation in the first query log as a complete query written in the second language, and the measure of segmentation quality for the candidate segmentation; and
providing at least one of the candidate translations as a cross-language query suggestion for the query based on respective scores of the candidate translations.
Dependent claims: 3, 4, 5, 6
7. A system, comprising:
one or more processors; and
memory having instructions stored thereon, the instructions, when executed by the one or more processors, cause the one or more processors to perform operations comprising:
receiving a query written in a first language;
obtaining one or more unique candidate segmentations of the query in the first language, each unique candidate segmentation consisting of a respective sequence of segments resulted from segmenting the query in the first language;
for each of the one or more unique candidate segmentations:
determining a respective measure of segmentation quality for the unique candidate segmentation; and
obtaining a respective set of one or more candidate translations in a second language by translating the respective sequence of segments of the candidate segmentation;
for each candidate translation of each of the one or more unique candidate segmentations:
determining a first frequency of occurrence of the candidate translation in a first query log as a complete query written in the second language; and
determining a respective score for the candidate translation based at least on the first frequency of occurrence of the candidate translation in the first query log as a complete query written in the second language, and the measure of segmentation quality for the candidate segmentation; and
providing at least one of the candidate translations as a cross-language query suggestion for the query based on respective scores of the candidate translations.
Dependent claims: 8, 9, 10, 11
Specification