Query language identification

  • US 9,727,605 B1
  • Filed: 04/09/2013
  • Issued: 08/08/2017
  • Est. Priority Date: 04/19/2006
  • Status: Active Grant
First Claim

1. A method comprising:

    providing, by a system comprising one or more computers, a plurality of user interfaces through which search queries are received, wherein each user interface is in a respective interface language, and wherein each interface language is a natural language in which a respective user interface presents information;

    maintaining a collection of query records, wherein the collection of query records includes distinct subsets of query records, wherein each distinct subset of query records is associated with a respective user interface of the plurality of user interfaces, wherein each query record associates a past query with one or more result documents, and wherein each result document has an associated natural language;

    classifying each past query in the collection of query records based at least on:

    (i) the interface language of the user interface through which the past query was received, and (ii) at least one of:

    (a) a natural language of the one or more result documents associated with the past query, or (b) the natural language of one or more result documents that were selected;

    generating an initial distribution of languages associated with the past queries for each user interface of the plurality of user interfaces based on the classifying, wherein the initial distribution indicates, for each user interface of the plurality of user interfaces and for each of multiple natural languages, what proportion of the past queries from the plurality of query records were in the language for the interface;

    generating, based at least on the initial distribution of languages associated with the past queries for each user interface of the plurality of user interfaces, an interface language classifier that is trained to predict, for a given user interface and a given language, a proportion of queries that are received through the given user interface that are likely in the given language;

    receiving, in the system, through a first user interface of the plurality of user interfaces, a search query comprising one or more query terms;

    using the interface language classifier, that was generated based at least on the initial distribution of languages associated with the past queries for each user interface of the plurality of user interfaces, to determine a likelihood that the search query is in a particular natural language of the multiple natural languages, given that the first user interface is the user interface that received the query; and

    providing one or more results responsive to the search query received through the first user interface, wherein the one or more results comprises results in a most likely natural language of the search query, which is automatically determined by the interface language classifier from the multiple natural languages.
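
The steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation. The function names, the record layout (interface language, result-document languages, selected-document languages), the majority-vote labeling rule, and the Bayes-style combination of a per-interface prior with a per-language term score are all assumptions made for illustration; the claim itself does not specify how these steps are carried out.

```python
from collections import Counter, defaultdict


def classify_past_query(interface_lang, result_langs, clicked_langs=()):
    """Label one past query with a language (hypothetical heuristic).

    Votes come from the languages of selected documents if there are any,
    otherwise from all result documents; the interface language adds one
    extra vote and wins ties.
    """
    votes = Counter(clicked_langs or result_langs)
    votes[interface_lang] += 1
    return max(votes, key=lambda lang: (votes[lang], lang == interface_lang))


def initial_distribution(records):
    """For each interface, the proportion of past queries per language.

    `records` is an iterable of (interface_lang, result_langs, clicked_langs)
    tuples; this layout is an assumption, not taken from the patent.
    """
    counts = defaultdict(Counter)
    for interface_lang, result_langs, clicked_langs in records:
        label = classify_past_query(interface_lang, result_langs, clicked_langs)
        counts[interface_lang][label] += 1
    return {
        ui: {lang: n / sum(c.values()) for lang, n in c.items()}
        for ui, c in counts.items()
    }


def query_language_likelihoods(term_scores, prior):
    """Combine per-language term scores with the interface prior.

    `term_scores` maps language -> P(query terms | language) from some
    separate, hypothetical term model. Falls back to the prior alone when
    no language has a nonzero joint score.
    """
    joint = {lang: term_scores.get(lang, 0.0) * p for lang, p in prior.items()}
    total = sum(joint.values())
    if total == 0:
        return dict(prior)
    return {lang: value / total for lang, value in joint.items()}


# Four past queries received through a German-language interface.
records = [
    ("de", ["de", "de"], []),
    ("de", ["en", "en", "en"], ["en", "en"]),
    ("de", ["de"], ["de"]),
    ("de", ["en", "de"], []),
]
dist = initial_distribution(records)
posterior = query_language_likelihoods({"de": 0.1, "en": 0.9}, dist["de"])
most_likely = max(posterior, key=posterior.get)
```

In this toy data, three of the four past queries are labeled German, so the German interface starts with a prior that favors German. A new query whose terms look strongly English can still be classified as English once the term scores are combined with that prior, and results in that language would then be returned.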

  • 2 Assignments