Evaluating pronouns in context

US 9,378,730 B1
Filed: 11/12/2013
Issued: 06/28/2016
Est. Priority Date: 10/10/2012
Status: Active Grant

First Claim

1. A computer-implemented method comprising:

obtaining, by a speech recognition engine implemented on a mobile computing device, a transcription of an utterance encoded in an audio signal;

determining, by the speech recognition engine, that the transcription includes a pronoun and one or more keywords associated with a command;

disambiguating, by the speech recognition engine, the pronoun based on an item of content that is identified by a referring application, wherein the referring application is an application executing on the mobile computing device through which recording of the audio signal was initiated;

generating, by the speech recognition engine, the command using the keywords and the disambiguated pronoun; and

submitting the generated command for execution.
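
The steps of this claim can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the pronoun and keyword vocabularies, the `ReferringApp` and `Command` types, and the `build_command` function are all hypothetical names introduced here, and the transcription is assumed to be already available as text.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical vocabularies; the claim does not enumerate pronouns or keywords.
PRONOUNS = {"it", "this", "that", "him", "her", "them"}
COMMAND_KEYWORDS = {"call": "CALL", "share": "SHARE", "email": "EMAIL", "play": "PLAY"}


@dataclass
class ReferringApp:
    """Stand-in for the application through which recording was initiated."""
    name: str
    current_item: Optional[str]  # item of content the app identifies, if any


@dataclass
class Command:
    action: str
    target: str


def build_command(transcription: str, app: ReferringApp) -> Optional[Command]:
    """Return a command when the transcription has a keyword and a resolvable pronoun."""
    tokens = re.findall(r"[a-z']+", transcription.lower())
    actions = [COMMAND_KEYWORDS[t] for t in tokens if t in COMMAND_KEYWORDS]
    pronouns = [t for t in tokens if t in PRONOUNS]
    if not actions or not pronouns:
        return None  # no pronoun plus command keyword pair was found
    if app.current_item is None:
        return None  # the referring app identifies nothing to resolve against
    # Disambiguate: the app-identified item stands in for the pronoun.
    return Command(action=actions[0], target=app.current_item)


app = ReferringApp(name="contacts", current_item="Alice Example")
print(build_command("call him", app))
# prints Command(action='CALL', target='Alice Example')
```

In this sketch the final "submitting" step is left to the caller, who would hand the returned `Command` to whatever executes it.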

2 Assignments

0 Petitions

Abstract

Methods, computer program products, and systems are described for receiving, by a speech recognition engine, audio data that encodes an utterance and determining, by the speech recognition engine, that a transcription of the utterance includes one or more keywords associated with a command, and a pronoun. In addition, the methods, computer program products, and systems described herein pertain to transmitting a disambiguation request to an application, wherein the disambiguation request identifies the pronoun, receiving, by the speech recognition engine, a response to the disambiguation request, wherein the response references an item of content identified by the application, and generating, by the speech recognition engine, the command using the keywords and the response.

20 Claims

1. A computer-implemented method comprising:
- obtaining, by a speech recognition engine implemented on a mobile computing device, a transcription of an utterance encoded in an audio signal;
  
  determining, by the speech recognition engine, that the transcription includes a pronoun and one or more keywords associated with a command;
  
  disambiguating, by the speech recognition engine, the pronoun based on an item of content that is identified by a referring application, wherein the referring application is an application executing on the mobile computing device through which recording of the audio signal was initiated;
  
  generating, by the speech recognition engine, the command using the keywords and the disambiguated pronoun; and
  
  submitting the generated command for execution.
- Dependent claims (2, 3, 4, 5, 6, 7):
- - 2. The method of claim 1, wherein disambiguating, by the speech recognition engine, the pronoun based on an item of content that is identified by a referring application comprises:
    - transmitting one or more disambiguation requests to a disambiguation engine;
      
      receiving, by the speech recognition engine, a response to the one or more disambiguation requests;
      
      evaluating the responses to the one or more disambiguation requests to determine a semantic connection between the referring application and the command; and
      
      generating a mapping of the referring application to the command.
  - 3. The method of claim 2, wherein receiving, by the speech recognition engine, a response to the one or more disambiguation requests further comprises:
    - receiving a first response to a first disambiguation request;
      
      after receiving the first response to the first disambiguation request, receiving a response to one or more additional disambiguation requests;
      
      merging the response to the one or more additional disambiguation requests with the first response to the first disambiguation request; and
      
      generating a set of weighted results by weighting each of the merged responses based on the likelihood that each respective merged response is relevant to disambiguation of the pronoun.
  - 4. The method of claim 1, wherein generating the command comprises accessing predetermined rules pertaining to one or more keywords associated with the pronoun.
  - 5. The method of claim 1, wherein disambiguating the pronoun further comprises receiving, by the speech recognition engine, data from the referring application that includes a GPS location and a user identifier.
  - 6. The method of claim 5, wherein the speech recognition engine employs one or more predetermined rules based on the GPS location and the user identifier.
  - 7. The method of claim 1, further comprising receiving, by the speech recognition engine, data indicating a selection of a control for initiating speech recognition that is presented by the referring application.
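
The merging and weighting recited in claim 3 can be sketched as follows. The claim does not say how the likelihood of relevance is computed, so this example assumes each response carries a numeric relevance score, sums the scores per candidate, and normalizes them into weights. The `Response` type and the `merge_and_weight` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Response:
    """Hypothetical disambiguation response: a candidate referent and a score."""
    candidate: str
    relevance: float  # assumed to be non-negative


def merge_and_weight(first: Response, additional: Iterable[Response]) -> List[Tuple[str, float]]:
    """Merge responses per candidate, then normalize scores into weights summing to 1."""
    totals: Dict[str, float] = {}
    for response in [first, *additional]:
        totals[response.candidate] = totals.get(response.candidate, 0.0) + response.relevance
    grand_total = sum(totals.values())
    if grand_total == 0:
        # No evidence for any candidate: return zero weights rather than divide by zero.
        return [(candidate, 0.0) for candidate in totals]
    weighted = [(candidate, score / grand_total) for candidate, score in totals.items()]
    # Highest-weighted candidate first.
    return sorted(weighted, key=lambda pair: pair[1], reverse=True)


results = merge_and_weight(
    Response("photo.jpg", 0.5),
    [Response("Alice", 0.25), Response("photo.jpg", 0.25)],
)
print(results)
# prints [('photo.jpg', 0.75), ('Alice', 0.25)]
```

Summing and normalizing is only one plausible weighting scheme; any scoring that ranks merged responses by estimated relevance to the pronoun would fit the same shape.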

8. A system comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  
  obtaining, by a speech recognition engine implemented on a mobile computing device, a transcription of an utterance encoded in an audio signal;
  
  determining, by the speech recognition engine, that the transcription includes a pronoun and one or more keywords associated with a command;
  
  disambiguating, by the speech recognition engine, the pronoun based on an item of content that is identified by a referring application, wherein the referring application is an application executing on the mobile computing device through which recording of the audio signal was initiated;
  
  generating, by the speech recognition engine, the command using the keywords and the disambiguated pronoun; and
  
  submitting the generated command for execution.
- Dependent claims (9, 10, 11, 12, 13, 14):
- - 9. The system of claim 8, wherein disambiguating, by the speech recognition engine, the pronoun based on an item of content that is identified by a referring application comprises:
    - transmitting one or more disambiguation requests to a disambiguation engine;
      
      receiving, by the speech recognition engine, a response to the one or more disambiguation requests;
      
      evaluating the responses to the one or more disambiguation requests to determine a semantic connection between the referring application and the command; and
      
      generating a mapping of the referring application to the command.
  - 10. The system of claim 8, wherein receiving, by the speech recognition engine, a response to the one or more disambiguation requests further comprises:
    - receiving a first response to a first disambiguation request;
      
      after receiving the first response to the first disambiguation request, receiving a response to one or more additional disambiguation requests;
      
      merging the response to the one or more additional disambiguation requests with the first response to the first disambiguation request; and
      
      generating a set of weighted results by weighting each of the merged responses based on the likelihood that each respective merged response is relevant to disambiguation of the pronoun.
  - 11. The system of claim 8, wherein generating the command comprises accessing predetermined rules pertaining to one or more keywords associated with the pronoun.
  - 12. The system of claim 8, wherein disambiguating the pronoun further comprises receiving, by the speech recognition engine, data from the referring application that includes a GPS location and a user identifier.
  - 13. The system of claim 12, wherein the speech recognition engine employs one or more predetermined rules based on the GPS location and the user identifier.
  - 14. The system of claim 8, wherein the operations further comprise receiving, by the speech recognition engine, data indicating a selection of a control for initiating speech recognition that is presented by the referring application.
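
One way to picture the predetermined rules of claims 12 and 13 is a small ordered rule table evaluated against the data sent by the referring application. The claims do not describe what the rules contain, so everything below is an assumption made for illustration: the `AppContext` and `Rule` types, the bounding-box test, the sample user identifier and coordinates, and the referent kinds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class AppContext:
    """Hypothetical data received from the referring application."""
    user_id: str
    latitude: float
    longitude: float


@dataclass(frozen=True)
class Rule:
    """A predetermined rule: when `matches(context)` holds, prefer `referent_kind`."""
    matches: Callable[[AppContext], bool]
    referent_kind: str


def in_box(ctx: AppContext, south: float, west: float, north: float, east: float) -> bool:
    """True when the GPS location lies inside a latitude/longitude bounding box."""
    return south <= ctx.latitude <= north and west <= ctx.longitude <= east


# Illustrative rule table; the identifier and coordinates are made up.
RULES = [
    Rule(lambda c: c.user_id == "user-123" and in_box(c, 37.0, -123.0, 38.0, -122.0),
         "nearby_place"),
    Rule(lambda c: True, "on_screen_item"),  # fallback when no specific rule applies
]


def preferred_referent_kind(ctx: AppContext, rules: Sequence[Rule] = RULES) -> Optional[str]:
    """Return the referent kind of the first rule that matches, or None."""
    for rule in rules:
        if rule.matches(ctx):
            return rule.referent_kind
    return None


print(preferred_referent_kind(AppContext("user-123", 37.4, -122.1)))
# prints nearby_place
```

Ordering the rules from most to least specific keeps the lookup simple: the first match decides which kind of content item the pronoun is resolved against.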

15. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
- obtaining, by a speech recognition engine implemented on a mobile computing device, a transcription of an utterance encoded in an audio signal;
  
  determining, by the speech recognition engine, that the transcription includes a pronoun and one or more keywords associated with a command;
  
  disambiguating, by the speech recognition engine, the pronoun based on an item of content that is identified by a referring application, wherein the referring application is an application executing on the mobile computing device through which recording of the audio signal was initiated;
  
  generating, by the speech recognition engine, the command using the keywords and the disambiguated pronoun; and
  
  submitting the generated command for execution.
- Dependent claims (16, 17, 18, 19, 20):
- - 16. The computer-readable medium of claim 15, wherein disambiguating, by the speech recognition engine, the pronoun based on an item of content that is identified by a referring application comprises:
    - transmitting one or more disambiguation requests to a disambiguation engine;
      
      receiving, by the speech recognition engine, a response to the one or more disambiguation requests;
      
      evaluating the responses to the one or more disambiguation requests to determine a semantic connection between the referring application and the command; and
      
      generating a mapping of the referring application to the command.
  - 17. The computer-readable medium of claim 15, wherein receiving, by the speech recognition engine, a response to the one or more disambiguation requests further comprises:
    - receiving a first response to a first disambiguation request;
      
      after receiving the first response to the first disambiguation request, receiving a response to one or more additional disambiguation requests;
      
      merging the response to the one or more additional disambiguation requests with the first response to the first disambiguation request; and
      
      generating a set of weighted results by weighting each of the merged responses based on the likelihood that each respective merged response is relevant to disambiguation of the pronoun.
  - 18. The computer-readable medium of claim 15, wherein generating the command comprises accessing predetermined rules pertaining to one or more keywords associated with the pronoun.
  - 19. The computer-readable medium of claim 15, wherein disambiguating the pronoun further comprises receiving, by the speech recognition engine, data from the referring application that includes a GPS location and a user identifier.
  - 20. The computer-readable medium of claim 19, wherein the speech recognition engine employs one or more predetermined rules based on the GPS location and the user identifier.

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Tickner, Simon; Cohen, Richard Z.
Primary Examiner(s): Saint Cyr, Leonard
Application Number: US14/077,368
Time in Patent Office: 959 Days
Field of Search: 704/231, 704/246, 704/247, 704/251, 704/252
US Class Current: 1/1
CPC Class Codes:

G10L 15/02   Feature extraction for spee...

G10L 15/1815   Semantic context, e.g. disa...

G10L 15/22   Procedures used during a sp...

G10L 15/26   Speech to text systems G10L...
