Suggesting and refining user input based on original user input

US 9,020,924 B2
Filed: 09/13/2012
Issued: 04/28/2015
Est. Priority Date: 05/04/2005
Status: Active Grant

Abstract

Systems and methods to generate modified/refined user inputs based on the original user input, such as a search query, are disclosed. The method may be implemented for Roman-based and/or non-Roman-based languages such as Chinese. The method may generally include receiving an original user input and identifying core terms therein, determining potential alternative inputs by replacing core term(s) in the original input with another term according to a similarity matrix and/or substituting a word sequence in the original input with another word sequence according to an expansion/contraction table where one word sequence is a substring of the other, computing the likelihood of each potential alternative input, and selecting the most likely alternative inputs according to a predetermined criterion, e.g., the likelihood of the alternative input being at least that of the original input. A cache containing pre-computed original user inputs and corresponding alternative inputs may be provided.
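
The substitution-and-selection step summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the table contents, the toy likelihood function, and every name below (`find_subsequence`, `alternatives`, `TABLE`, `toy_likelihood`) are hypothetical, and the selection criterion used is the example one from the abstract (an alternative is kept when it is at least as likely as the original input).

```python
from typing import Callable, Dict, List


def find_subsequence(tokens: List[str], seq: List[str]) -> int:
    """Return the start index of the first contiguous match of seq, or -1."""
    n, m = len(tokens), len(seq)
    if m == 0:
        return -1
    for i in range(n - m + 1):
        if tokens[i:i + m] == seq:
            return i
    return -1


def alternatives(query: str,
                 table: Dict[str, List[str]],
                 likelihood: Callable[[str], float]) -> List[str]:
    """Swap word sequences per the table; keep candidates that are at
    least as likely as the original input, most likely first."""
    tokens = query.split()
    base = likelihood(query)
    kept: List[str] = []
    for source, targets in table.items():
        src = source.split()
        start = find_subsequence(tokens, src)
        if start < 0:
            continue
        end = start + len(src)
        for target in targets:
            candidate = " ".join(tokens[:start] + target.split() + tokens[end:])
            if (candidate != query and candidate not in kept
                    and likelihood(candidate) >= base):
                kept.append(candidate)
    return sorted(kept, key=likelihood, reverse=True)


# Hypothetical expansion/contraction table: in each pair, one word
# sequence is a substring of the other.
TABLE = {"shrek": ["shrek 2"], "google mail": ["mail"]}

# Hypothetical likelihood: raw counts from an imagined query log.
TOY_COUNTS = {"shrek dvd": 30, "shrek 2 dvd": 50}


def toy_likelihood(q: str) -> float:
    return float(TOY_COUNTS.get(q, 0))


print(alternatives("shrek dvd", TABLE, toy_likelihood))  # ['shrek 2 dvd']
```

In practice the likelihood would come from a language model or query-log statistics; the dictionary lookup here only keeps the sketch self-contained.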

21 Claims

1. A computer-implemented method comprising:
- obtaining a plurality of queries received from a user in a current query session, wherein a most recent query received from the user in the current query session includes a first sequence of terms;
  
  identifying a plurality of second sequences of terms having highest-ranked measures of similarity with the first sequence of terms, the respective measures of similarity being determined between (1) a first feature vector for the first sequence of terms and (2) respective second feature vectors for each of the second sequences of terms, each of the one or more second sequences of terms being a subsequence of the first sequence of terms or being a sequence of which the first sequence of terms is a subsequence, wherein each value of the first feature vector and the respective second feature vectors is based on a count of a corresponding co-occurring term occurring in a corpus adjacent to each respective sequence of terms;
  
  generating a plurality of query suggestions for a particular query received in the current query session, including replacing the first sequence of terms in the most recent query with each of the plurality of highest-ranked second sequences of terms, wherein the first sequence of terms in the most recent query is a subsequence of the second sequence of terms or the second sequence of terms is a subsequence of the first sequence of terms in the most recent query;
  
  determining a respective score for each of the plurality of query suggestions, wherein the score is based on a relevance between each query suggestion and the plurality of queries received from the user in the current query session; and
  
  ranking the query suggestions by the determined scores.
- Dependent Claims (2, 3, 4, 5, 6, 7, 18)
- - 2. The method of claim 1, further comprising:
    - providing one or more highest-ranked query suggestions in response to receiving the particular query from the user.
  - 3. The method of claim 1, further comprising:
    - computing, for each feature of the respective feature vectors for the second sequences of terms, a respective point-wise mutual information score between the second sequence of terms and each corresponding co-occurring term, wherein the feature value for the co-occurring term in the feature vector for the second sequence of terms is the point-wise mutual information score.
  - 4. The method of claim 1, wherein identifying the plurality of second sequences of terms comprises:
    - generating an expansion/contraction table that includes pairs of sequences of terms having highest respective similarity measures; and
      
      identifying a plurality of pairs of sequences of terms that include the first sequence of terms in the expansion/contraction table.
  - 5. The method of claim 4, wherein generating the expansion/contraction table comprises:
    - determining a plurality of frequently occurring word sequences; and
      
      filtering out non-phrasal word sequences from the frequently occurring word sequences, wherein a non-phrasal word sequence is a word sequence that does not occur at a beginning or an end of at least a threshold number of queries in a collection of queries.
  - 6. The method of claim 1, wherein the score for a query suggestion is further based on:
    - a probability that the query suggestion will be selected; and
      
      a position of a selected search result that was previously provided in response to receiving the query suggestion as a search query.
  - 7. The method of claim 1, further comprising determining the relevance between each query suggestion and the plurality of queries received from the user in the current query session, including determining correlation values between aligned terms of each query suggestion and each query in the plurality of queries received from the user in the current query session.
  - 18. The method of claim 7, wherein a correlation value between aligned terms is based on a function of a plurality of weights, each weight corresponding to a respective relationship between the aligned terms, wherein the relationships between the aligned terms include one or more of a synonym relationship, an acronym relationship, an antonym relationship, a compound phrase relationship, or a same category relationship.
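
The feature vectors of claim 1 and the point-wise mutual information weighting of claim 3 can be sketched as follows. This is an illustration under stated assumptions, not the claimed method itself: the claims do not name a similarity function, so cosine similarity is swapped in here, and tagging each feature with the side ("L" or "R") on which the adjacent term occurs is also an assumption. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Mapping, Tuple

Feature = Tuple[str, str]  # (side, adjacent term); the side tag is an assumption


def adjacent_counts(corpus: List[str], seq: str) -> Counter:
    """Count terms occurring immediately left or right of seq in the corpus."""
    target = seq.split()
    m = len(target)
    counts: Counter = Counter()
    for line in corpus:
        tokens = line.split()
        for i in range(len(tokens) - m + 1):
            if tokens[i:i + m] == target:
                if i > 0:
                    counts[("L", tokens[i - 1])] += 1
                if i + m < len(tokens):
                    counts[("R", tokens[i + m])] += 1
    return counts


def pmi_vectors(counts: Mapping[str, Counter]) -> Dict[str, Dict[Feature, float]]:
    """Replace raw counts with PMI(seq, feature) = log(p(seq, f) / (p(seq) p(f)))."""
    total = sum(sum(c.values()) for c in counts.values())
    feature_totals: Counter = Counter()
    for c in counts.values():
        feature_totals.update(c)
    vectors: Dict[str, Dict[Feature, float]] = {}
    for seq, c in counts.items():
        seq_total = sum(c.values())
        vectors[seq] = {
            f: math.log(n * total / (seq_total * feature_totals[f]))
            for f, n in c.items()
        }
    return vectors


def cosine(u: Mapping[Feature, float], v: Mapping[Feature, float]) -> float:
    """Cosine similarity between two sparse vectors (assumed similarity measure)."""
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus; "shrek" is a subsequence of "shrek 2".
corpus = ["buy shrek dvd", "buy shrek 2 dvd", "watch shrek 2 online"]
raw = {s: adjacent_counts(corpus, s) for s in ("shrek", "shrek 2")}
print(cosine(raw["shrek"], raw["shrek 2"]))
print(pmi_vectors(raw)["shrek"])
```

Candidate second sequences would then be ranked by this similarity against the first sequence, with only the highest-ranked ones retained.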

8. A system comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  
  obtaining a plurality of queries received from a user in a current query session, wherein a most recent query received from the user in the current query session includes a first sequence of terms;
  
  identifying a plurality of second sequences of terms having highest-ranked measures of similarity with the first sequence of terms, the respective measures of similarity being determined between (1) a first feature vector for the first sequence of terms and (2) respective second feature vectors for each of the second sequences of terms, each of the one or more second sequences of terms being a subsequence of the first sequence of terms or being a sequence of which the first sequence of terms is a subsequence, wherein each value of the first feature vector and the respective second feature vectors is based on a count of a corresponding co-occurring term occurring in a corpus adjacent to each respective sequence of terms;
  
  generating a plurality of query suggestions for a particular query received in the current query session, including replacing the first sequence of terms in the most recent query with each of the plurality of highest-ranked second sequences of terms, wherein the first sequence of terms in the most recent query is a subsequence of the second sequence of terms or the second sequence of terms is a subsequence of the first sequence of terms in the most recent query;
  
  determining a respective score for each of the plurality of query suggestions, where the score is based on a relevance between each query suggestion and the plurality of queries received from the user in the current query session; and
  
  ranking the query suggestions by the determined scores.
- Dependent Claims (9, 10, 11, 12, 13, 14)
- - 9. The system of claim 8, wherein the operations further comprise:
    - providing one or more highest-ranked query suggestions in response to receiving the particular query from the user.
  - 10. The system of claim 9, wherein the operations further comprise:
    - computing, for each feature of the respective feature vectors for the second sequences of terms, a respective point-wise mutual information score between the second sequence of terms and each corresponding co-occurring term, wherein the feature value for the co-occurring term in the feature vector for the second sequence of terms is the point-wise mutual information score.
  - 11. The system of claim 9, wherein the operations further comprise:
    - generating an expansion/contraction table that includes pairs of sequences of terms having highest respective similarity measures; and
      
      identifying a plurality of pairs of sequences of terms that include the first sequence of terms in the expansion/contraction table.
  - 12. The system of claim 11, wherein generating the expansion/contraction table comprises:
    - determining a plurality of frequently occurring word sequences; and
      
      filtering out non-phrasal word sequences from the frequently occurring word sequences, wherein a non-phrasal word sequence is a word sequence that does not occur at a beginning or an end of at least a threshold number of queries in a collection of queries.
  - 13. The system of claim 8, wherein the score for a query suggestion is further based on:
    - a probability that the query suggestion will be selected; and
      
      a position of a selected search result that was previously provided in response to receiving the query suggestion as a search query.
  - 14. The system of claim 8, wherein the operations further comprise determining the relevance between each query suggestion and the plurality of queries received from the user in the current query session, including determining correlation values between aligned terms of each query suggestion and each query in the plurality of queries received from the user in the current query session.
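
The phrasal filter recited in claims 5, 12, and 19 can be sketched as below. The reading adopted here is an assumption: a word sequence is kept when it occurs at the beginning or the end of at least `threshold` queries, counting each query once. The function names and the sample queries are hypothetical.

```python
from typing import Iterable, List


def boundary_count(seq: str, queries: Iterable[str]) -> int:
    """Number of queries that begin or end with the word sequence seq."""
    target = seq.split()
    m = len(target)
    if m == 0:
        return 0
    count = 0
    for query in queries:
        tokens = query.split()
        if len(tokens) >= m and (tokens[:m] == target or tokens[-m:] == target):
            count += 1
    return count


def filter_phrasal(candidates: Iterable[str],
                   queries: Iterable[str],
                   threshold: int) -> List[str]:
    """Drop non-phrasal sequences: those seen at a query boundary
    in fewer than `threshold` queries."""
    query_list = list(queries)
    return [s for s in candidates if boundary_count(s, query_list) >= threshold]


# Hypothetical query collection and frequently occurring word sequences.
queries = [
    "new york hotels",
    "cheap flights new york",
    "new york",
    "of the rings",
    "lord of the rings",
]
print(filter_phrasal(["new york", "of the"], queries, threshold=2))  # ['new york']
```

The intuition is that a genuine phrase such as "new york" tends to appear at query boundaries, whereas a fragment such as "of the" mostly appears mid-query.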

15. A computer program product, encoded on one or more non-transitory computer storage media, comprising instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:
- obtaining a plurality of queries received from a user in a current query session, wherein a most recent query received from the user in the current query session includes a first sequence of terms;
  
  identifying a plurality of second sequences of terms having highest-ranked measures of similarity with the first sequence of terms, the respective measures of similarity being determined between (1) a first feature vector for the first sequence of terms and (2) respective second feature vectors for each of the second sequences of terms, each of the one or more second sequences of terms being a subsequence of the first sequence of terms or being a sequence of which the first sequence of terms is a subsequence, wherein each value of the first feature vector and the respective second feature vectors is based on a count of a corresponding co-occurring term occurring in a corpus adjacent to each respective sequence of terms;
  
  generating a plurality of query suggestions for a particular query received in the current query session, including replacing the first sequence of terms in the most recent query with each of the plurality of highest-ranked second sequences of terms, wherein the first sequence of terms in the most recent query is a subsequence of the second sequence of terms or the second sequence of terms is a subsequence of the first sequence of terms in the most recent query;
  
  determining a respective score for each of the plurality of query suggestions, where the score is based on a relevance between each query suggestion and the plurality of queries received from the user in the current query session; and
  
  ranking the query suggestions by the determined scores.
- Dependent Claims (16, 17, 19, 20, 21)
- - 16. The computer program product of claim 15, wherein the operations further comprise:
    - providing one or more highest-ranked query suggestions in response to receiving the particular query from the user.
  - 17. The computer program product of claim 15, wherein identifying the plurality of second sequences of terms comprises:
    - generating an expansion/contraction table that includes pairs of sequences of terms having highest respective similarity measures; and
      
      identifying a plurality of pairs of sequences of terms that include the first sequence of terms in the expansion/contraction table.
  - 19. The computer program product of claim 17, wherein generating the expansion/contraction table comprises:
    - determining a plurality of frequently occurring word sequences; and
      
      filtering out non-phrasal word sequences from the frequently occurring word sequences, wherein a non-phrasal word sequence is a word sequence that does not occur at a beginning or an end of at least a threshold number of queries in a collection of queries.
  - 20. The computer program product of claim 15, wherein the score for a query suggestion is further based on:
    - a probability that the query suggestion will be selected; and
      
      a position of a selected search result that was previously provided in response to receiving the query suggestion as a search query.
  - 21. The computer program product of claim 15, wherein the operations further comprise determining the relevance between each query suggestion and the plurality of queries received from the user in the current query session, including determining correlation values between aligned terms of each query suggestion and each query in the plurality of queries received from the user in the current query session.
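
The session-based scoring and ranking steps shared by the independent claims, together with the aligned-term correlation of claims 7, 14, 18, and 21, might look like the following sketch. Several choices are assumptions rather than anything the claims specify: terms are aligned by position, the "function of a plurality of weights" is a capped sum, relevance is the mean correlation over all aligned pairs, and the weight values and relation table are invented for illustration.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical weights, one per relationship type named in claim 18.
WEIGHTS: Dict[str, float] = {
    "synonym": 0.9,
    "acronym": 0.8,
    "compound_phrase": 0.6,
    "same_category": 0.5,
    "antonym": 0.2,
}

Relations = Dict[FrozenSet[str], Set[str]]


def correlation(a: str, b: str, relations: Relations) -> float:
    """Correlation between two aligned terms: 1.0 if identical, otherwise
    the capped sum of the weights of the relationships that hold (assumption)."""
    if a == b:
        return 1.0
    kinds = relations.get(frozenset((a, b)), set())
    return min(1.0, sum(WEIGHTS.get(k, 0.0) for k in kinds))


def relevance(suggestion: str, session: List[str], relations: Relations) -> float:
    """Mean correlation over positionally aligned terms of the suggestion
    and every query in the current session (alignment scheme is an assumption)."""
    s_tokens = suggestion.split()
    total, pairs = 0.0, 0
    for query in session:
        for a, b in zip(s_tokens, query.split()):
            total += correlation(a, b, relations)
            pairs += 1
    return total / pairs if pairs else 0.0


def rank(suggestions: List[str], session: List[str],
         relations: Relations) -> List[Tuple[float, str]]:
    """Score each suggestion against the session and sort best first."""
    scored = [(relevance(s, session, relations), s) for s in suggestions]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical session and relation table.
session = ["nyc hotels", "nyc flights"]
relations: Relations = {frozenset(("nyc", "manhattan")): {"same_category"}}
print(rank(["paris hotels", "manhattan hotels"], session, relations))
# [(0.5, 'manhattan hotels'), (0.25, 'paris hotels')]
```

The additional score factors of claims 6, 13, and 20 (selection probability and the position of a previously selected search result) are not modeled here.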

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Wu, Jun; Lin, Dekang; Qian, Zhe; Zhou, Jie
Primary Examiner(s): Ruiz, Angelica
Application Number: US13/615,518
Publication Number: US 20130103696A1
Time in Patent Office: 957 Days

Field of Search
US Class Current: 707/706
CPC Class Codes:
G06F 16/242   Query formulation
G06F 16/24578   using ranking
G06F 16/3322   using system suggestions G0...
G06F 16/90324   using system suggestions
