Query language identification
First Claim
1. A computer-implemented method comprising:
receiving a query that is spoken by a user, the search query comprising one or more query terms;
determining an interface language that is historically associated with the user based on the interface language being a language of a user interface through which previous search queries that have been submitted by the user have been received;
processing the one or more terms of the query to identify a query language that is different than the interface language that is historically associated with the user, the processing comprising;
accessing a collection of query records, wherein the collection of query records includes distinct subsets of query records, wherein each distinct subset of query records is associated with a respective user interface of a plurality of user interfaces from which queries are received, wherein each query record associates a past query with one or more result documents, and wherein each result document has an associated natural language;
determining, by the system, the query language of the search query from the search query, and the distinct subset of query records that is associated with the user interface of the user, the query language being a natural language, the determining comprising;
for each of multiple languages,
calculating a first score for each query term and the respective language, each first score indicating the likelihood that the respective query term is in the respective language, wherein the first score is calculated based on a plurality of documents, each document having an associated natural language,
calculating a second score for the respective language, the second score indicating the likelihood that the search query is in the respective language given the interface language of the first user interface through which the search query was received, where the second score is calculated based on the plurality of query records, and
calculating a third score for the respective language, the third score being a combination of the first score for the respective language and the second score for the respective language; and
determining the query language based on the third scores for the multiple languages;
generating one or more search results that match the query language that is different than the interface language that is historically associated with the user; and
providing, for output, a representation of one or more of the search results.
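The three-score determination recited above can be illustrated with a minimal sketch. The claim does not say how the first and second scores are combined, so the sum of log scores used here (a naive-Bayes-style product) is an assumption, as are the probability tables, the `UNSEEN` floor, and every function name; none of them come from the patent.

```python
import math

# Hypothetical P(term | language) values, standing in for statistics
# derived from documents that each have an associated natural language.
TERM_GIVEN_LANGUAGE = {
    "en": {"pizza": 0.002, "near": 0.004, "me": 0.010},
    "it": {"pizza": 0.006, "vicino": 0.003, "a": 0.020, "me": 0.004},
}

# Hypothetical P(query language | interface language) values, standing in
# for statistics derived from the query records of each user interface.
LANGUAGE_GIVEN_INTERFACE = {
    "en": {"en": 0.90, "it": 0.10},
    "it": {"en": 0.25, "it": 0.75},
}

UNSEEN = 1e-6  # assumed floor for terms or languages missing from a table


def first_scores(terms, language):
    """One score per query term: how likely the term is in `language`."""
    table = TERM_GIVEN_LANGUAGE[language]
    return [table.get(term, UNSEEN) for term in terms]


def second_score(language, interface_language):
    """How likely a query is in `language` given the interface language."""
    return LANGUAGE_GIVEN_INTERFACE[interface_language].get(language, UNSEEN)


def third_score(terms, language, interface_language):
    """Assumed combination: sum of log first scores plus log second score."""
    term_part = sum(math.log(s) for s in first_scores(terms, language))
    return term_part + math.log(second_score(language, interface_language))


def identify_query_language(query, interface_language):
    """Return the language with the highest third score, plus all scores."""
    terms = query.lower().split()
    scores = {
        language: third_score(terms, language, interface_language)
        for language in TERM_GIVEN_LANGUAGE
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    language, _ = identify_query_language("pizza vicino a me", "en")
    print(language)
```

With these made-up tables, a query entered through the English interface can still be identified as Italian when its terms fit Italian far better than English, which is the case the claim targets: a query language that differs from the interface language.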
Abstract
Methods, systems, and apparatus, including computer program products, for identifying the language of a search query. In one embodiment, the language of each term of a query is determined from the query terms and the language of the user interface a user used to enter the query. In another embodiment, an automatic interface language classifier is generated from a collection of past queries each submitted by a user. In some embodiments, a score is determined for each of multiple languages, each score indicating a likelihood that the query language is the corresponding one of the multiple languages.
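As a rough illustration of how past queries might yield a per-interface language likelihood, the sketch below counts the languages of result documents recorded for each interface. The record layout, the equal weighting of each past query, and the function name are all assumptions; the abstract does not describe the estimation procedure.

```python
from collections import Counter, defaultdict

# Hypothetical query records: (interface language, past query,
# natural languages of the result documents associated with that query).
QUERY_RECORDS = [
    ("en", "weather london", ["en", "en", "en"]),
    ("en", "ristorante roma", ["it", "it", "en"]),
    ("it", "meteo milano", ["it", "it"]),
    ("it", "new york hotels", ["en", "it"]),
]


def language_given_interface(records):
    """Estimate P(language | interface language) from query records.

    Each past query contributes one unit of weight, split evenly across
    the languages of its result documents (an assumed weighting scheme).
    """
    counts = defaultdict(Counter)
    for interface_language, _query, result_languages in records:
        if not result_languages:
            continue  # a record without result documents carries no signal
        weight = 1.0 / len(result_languages)
        for language in result_languages:
            counts[interface_language][language] += weight

    probabilities = {}
    for interface_language, language_counts in counts.items():
        total = sum(language_counts.values())
        probabilities[interface_language] = {
            language: count / total
            for language, count in language_counts.items()
        }
    return probabilities


if __name__ == "__main__":
    for interface, distribution in language_given_interface(QUERY_RECORDS).items():
        print(interface, distribution)
```

Keeping a separate distribution per interface mirrors the idea that each user interface has its own subset of query records, so the same query terms can be weighted differently depending on where the query was entered.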
20 Claims
1. A computer-implemented method comprising:
receiving a query that is spoken by a user, the search query comprising one or more query terms;
determining an interface language that is historically associated with the user based on the interface language being a language of a user interface through which previous search queries that have been submitted by the user have been received;
processing the one or more terms of the query to identify a query language that is different than the interface language that is historically associated with the user, the processing comprising;
accessing a collection of query records, wherein the collection of query records includes distinct subsets of query records, wherein each distinct subset of query records is associated with a respective user interface of a plurality of user interfaces from which queries are received, wherein each query record associates a past query with one or more result documents, and wherein each result document has an associated natural language;
determining, by the system, the query language of the search query from the search query, and the distinct subset of query records that is associated with the user interface of the user, the query language being a natural language, the determining comprising;
for each of multiple languages,
calculating a first score for each query term and the respective language, each first score indicating the likelihood that the respective query term is in the respective language, wherein the first score is calculated based on a plurality of documents, each document having an associated natural language,
calculating a second score for the respective language, the second score indicating the likelihood that the search query is in the respective language given the interface language of the first user interface through which the search query was received, where the second score is calculated based on the plurality of query records, and
calculating a third score for the respective language, the third score being a combination of the first score for the respective language and the second score for the respective language; and
determining the query language based on the third scores for the multiple languages;
generating one or more search results that match the query language that is different than the interface language that is historically associated with the user; and
providing, for output, a representation of one or more of the search results.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising;
receiving a query that is spoken by a user, the search query comprising one or more query terms;
determining an interface language that is historically associated with the user based on the interface language being a language of a user interface through which previous search queries that have been submitted by the user have been received;
processing the one or more terms of the query to identify a query language that is different than the interface language that is historically associated with the user, the processing comprising;
accessing a collection of query records, wherein the collection of query records includes distinct subsets of query records, wherein each distinct subset of query records is associated with a respective user interface of a plurality of user interfaces from which queries are received, wherein each query record associates a past query with one or more result documents, and wherein each result document has an associated natural language;
determining, by the system, the query language of the search query from the search query, and the distinct subset of query records that is associated with the user interface of the user, the query language being a natural language, the determining comprising;
for each of multiple languages,
calculating a first score for each query term and the respective language, each first score indicating the likelihood that the respective query term is in the respective language, wherein the first score is calculated based on a plurality of documents, each document having an associated natural language,
calculating a second score for the respective language, the second score indicating the likelihood that the search query is in the respective language given the interface language of the first user interface through which the search query was received, where the second score is calculated based on the plurality of query records, and
calculating a third score for the respective language, the third score being a combination of the first score for the respective language and the second score for the respective language; and
determining the query language based on the third scores for the multiple languages;
generating one or more search results that match the query language that is different than the interface language that is historically associated with the user; and
providing, for output, a representation of one or more of the search results.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
receiving a query that is spoken by a user, the search query comprising one or more query terms;
determining an interface language that is historically associated with the user based on the interface language being a language of a user interface through which previous search queries that have been submitted by the user have been received;
processing the one or more terms of the query to identify a query language that is different than the interface language that is historically associated with the user, the processing comprising;
accessing a collection of query records, wherein the collection of query records includes distinct subsets of query records, wherein each distinct subset of query records is associated with a respective user interface of a plurality of user interfaces from which queries are received, wherein each query record associates a past query with one or more result documents, and wherein each result document has an associated natural language;
determining, by the system, the query language of the search query from the search query, and the distinct subset of query records that is associated with the user interface of the user, the query language being a natural language, the determining comprising;
for each of multiple languages,
calculating a first score for each query term and the respective language, each first score indicating the likelihood that the respective query term is in the respective language, wherein the first score is calculated based on a plurality of documents, each document having an associated natural language,
calculating a second score for the respective language, the second score indicating the likelihood that the search query is in the respective language given the interface language of the first user interface through which the search query was received, where the second score is calculated based on the plurality of query records, and
calculating a third score for the respective language, the third score being a combination of the first score for the respective language and the second score for the respective language; and
determining the query language based on the third scores for the multiple languages;
generating one or more search results that match the query language that is different than the interface language that is historically associated with the user; and
providing, for output, a representation of one or more of the search results.
Dependent claims: 16, 17, 18, 19, 20
Specification