Systems and methods for searching using queries written in a different character-set and/or language from the target pages
Abstract
Methods and apparatus consistent with the invention allow a user to submit an ambiguous search query and to receive relevant search results. Queries can be expressed using character sets and/or languages that are different from the character set and/or language of at least some of the data that is to be searched. A translation between these character sets and/or languages can be performed by examining the use of terms in aligned text. Probabilities can be associated with each possible translation. Refinements can be made to these probabilities by examining user interactions with the search results.
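As a rough illustration of the idea in the abstract, the sketch below estimates translation probabilities from term co-occurrence in aligned text and expands a query into candidate translations ordered by probability. The toy aligned pairs, the function names `build_dictionary` and `translate_query`, the `top_k` parameter, and the plain co-occurrence counting are all assumptions made for this example; they are not taken from the specification.

```python
from collections import Counter, defaultdict
from itertools import product


def build_dictionary(aligned_pairs):
    """Estimate P(target term | source term) from co-occurrence in aligned text."""
    counts = defaultdict(Counter)
    for source_text, target_text in aligned_pairs:
        for s in source_text.split():
            for t in target_text.split():
                counts[s][t] += 1
    dictionary = {}
    for s, c in counts.items():
        total = sum(c.values())
        dictionary[s] = {t: n / total for t, n in c.items()}
    return dictionary


def translate_query(query, dictionary, top_k=3):
    """Expand a query into candidate translated queries, most probable first."""
    options = []
    for term in query.split():
        # Terms missing from the dictionary pass through unchanged.
        mapping = dictionary.get(term, {term: 1.0})
        best = sorted(mapping.items(), key=lambda kv: -kv[1])[:top_k]
        options.append(best)
    candidates = []
    for combo in product(*options):
        text = " ".join(t for t, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p  # assumes terms translate independently
        candidates.append((text, prob))
    return sorted(candidates, key=lambda c: -c[1])


# Hypothetical aligned text: a romanized term paired with target-script terms.
aligned_pairs = [
    ("hashi", "橋"),
    ("hashi", "橋"),
    ("hashi", "箸"),
]
dictionary = build_dictionary(aligned_pairs)
candidates = translate_query("hashi", dictionary)
```

Here the ambiguous query "hashi" yields two candidate queries, one per target-script term, each carrying the probability estimated from the aligned pairs.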
18 Claims
1. A computer-implemented method, the method comprising:
- receiving a search query from a user device, wherein the search query includes one or more terms, each term being written in a first format;
- translating, using a probabilistic dictionary, the one or more terms of the search query into a group of translated search queries, each translated search query having one or more terms in a second format, wherein the probabilistic dictionary includes a mapping of terms from the first format to the second format according to a respective calculated probability that a particular term in the first format corresponds to a term in the second format;
- using a search engine to identify a plurality of documents written in the second format that are responsive to the group of translated search queries;
- providing search results written in the second format to the user device, the search results referencing one or more of the identified documents;
- obtaining click data from the user device indicative of user selections of one or more of the search results written in the second format; and
- modifying the probabilistic dictionary of term mappings based at least in part on the obtained click data indicative of user selections of one or more of the search results written in the second format and adjusting at least one probability associated with at least one mapping in the probabilistic dictionary.
Dependent claims: 2, 3, 4, 5, 6
7. A system comprising:
- a computer readable medium including instructions; and
- data processing apparatus configured to execute the instructions to perform operations including:
  - receiving a search query from a user device, wherein the search query includes one or more terms, each term being written in a first format;
  - translating, using a probabilistic dictionary, the one or more terms of the search query into a group of translated search queries, each translated search query having one or more terms in a second format, wherein the probabilistic dictionary includes a mapping of terms from the first format to the second format according to a respective calculated probability that a particular term in the first format corresponds to a term in the second format;
  - using a search engine to identify a plurality of documents written in the second format that are responsive to the group of translated search queries;
  - providing search results written in the second format to the user device, the search results referencing one or more of the identified documents;
  - obtaining click data from the user device indicative of user selections of one or more of the search results written in the second format; and
  - modifying the probabilistic dictionary of term mappings based at least in part on the obtained click data indicative of user selections of one or more of the search results written in the second format and adjusting at least one probability associated with at least one mapping in the probabilistic dictionary.
Dependent claims: 8, 9, 10, 11, 12
13. A computer storage medium encoded with a computer program, the program comprising instructions that, when executed by data processing apparatus, cause the data processing apparatus to perform operations comprising:
- receiving a search query from a user device, wherein the search query includes one or more terms, each term being written in a first format;
- translating, using a probabilistic dictionary, the one or more terms of the search query into a group of translated search queries, each translated search query having one or more terms in a second format, wherein the probabilistic dictionary includes a mapping of terms from the first format to the second format according to a respective calculated probability that a particular term in the first format corresponds to a term in the second format;
- using a search engine to identify a plurality of documents written in the second format that are responsive to the group of translated search queries;
- providing search results written in the second format to the user device, the search results referencing one or more of the identified documents;
- obtaining click data from the user device indicative of user selections of one or more of the search results written in the second format; and
- modifying the probabilistic dictionary of term mappings based at least in part on the obtained click data indicative of user selections of one or more of the search results written in the second format and adjusting at least one probability associated with at least one mapping in the probabilistic dictionary.
Dependent claims: 14, 15, 16, 17, 18
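The final step recited in the independent claims, adjusting a mapping probability based on click data, might be sketched as follows. This is a minimal illustration only: the function name `update_from_click`, the `learning_rate` parameter, and the interpolation rule are assumptions chosen for the example, and the claims do not prescribe any particular update formula.

```python
def update_from_click(dictionary, source_term, clicked_target, learning_rate=0.1):
    """Shift probability mass toward the translation whose result was clicked.

    Each probability moves a fraction `learning_rate` of the way toward 1.0
    (for the clicked translation) or 0.0 (for the others), so the values for
    a source term still sum to 1 after the update.
    """
    mapping = dictionary.setdefault(source_term, {})
    if not mapping:
        # First observation for this term: the clicked translation gets all mass.
        mapping[clicked_target] = 1.0
        return mapping
    mapping.setdefault(clicked_target, 0.0)
    for target in mapping:
        observed = 1.0 if target == clicked_target else 0.0
        mapping[target] += learning_rate * (observed - mapping[target])
    return mapping


# Hypothetical starting probabilities for one ambiguous first-format term.
click_dictionary = {"hashi": {"橋": 0.6, "箸": 0.4}}

# A user clicked a search result that was retrieved via the "箸" translation.
updated = update_from_click(click_dictionary, "hashi", "箸")
```

A small learning rate keeps any single click from dominating the dictionary; how clicks are actually weighted or aggregated is left open here.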
Specification