VOICE RECOGNITION SYSTEM

US 20190214012A1
Filed: 03/14/2019
Published: 07/11/2019
Est. Priority Date: 01/06/2016
Status: Active Grant

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for voice recognition. In one aspect, a method includes the actions of receiving a voice input; determining a transcription for the voice input, wherein determining the transcription for the voice input includes, for a plurality of segments of the voice input: obtaining a first candidate transcription for a first segment of the voice input; determining one or more contexts associated with the first candidate transcription; adjusting a respective weight for each of the one or more contexts; and determining a second candidate transcription for a second segment of the voice input based in part on the adjusted weights; and providing the transcription of the plurality of segments of the voice input for output.
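
The segment-by-segment flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the keyword table, the multiplicative boost, the scoring formula, and every function name (`contexts_for`, `adjust_weights`, `pick_candidate`, `transcribe`) are hypothetical, and each segment is assumed to arrive as a list of already-scored hypotheses.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical keyword lists per context; not taken from the patent.
CONTEXT_KEYWORDS: Dict[str, List[str]] = {
    "tennis": ["tennis", "racket", "serve"],
    "basketball": ["basketball", "hoop", "dunk"],
}

# A hypothesis is (text, base score, context it belongs to or None).
Hypothesis = Tuple[str, float, Optional[str]]


def contexts_for(candidate: str) -> List[str]:
    """Return the contexts whose keywords appear in a candidate transcription."""
    words = set(candidate.lower().split())
    return [c for c, kws in CONTEXT_KEYWORDS.items() if words & set(kws)]


def adjust_weights(weights: Dict[str, float], matched: List[str],
                   boost: float = 2.0) -> Dict[str, float]:
    """Boost matched contexts (assumed multiplicative), then renormalize."""
    adjusted = {c: w * (boost if c in matched else 1.0) for c, w in weights.items()}
    total = sum(adjusted.values())
    return {c: w / total for c, w in adjusted.items()}


def pick_candidate(hypotheses: List[Hypothesis], weights: Dict[str, float]) -> str:
    """Choose the hypothesis with the best context-biased score."""
    def score(h: Hypothesis) -> float:
        _, base, ctx = h
        return base * (1.0 + weights.get(ctx, 0.0))
    return max(hypotheses, key=score)[0]


def transcribe(segments: List[List[Hypothesis]], weights: Dict[str, float]) -> str:
    """Transcribe segment by segment, updating context weights as we go."""
    parts = []
    for hypotheses in segments:
        text = pick_candidate(hypotheses, weights)
        parts.append(text)
        # The candidate for this segment influences the next segment.
        weights = adjust_weights(weights, contexts_for(text))
    return " ".join(parts)
```

With equal starting weights, a first segment that mentions "tennis" raises the tennis weight, so a slightly lower-scored tennis hypothesis can win the second segment over a basketball one.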

20 Claims

1. A method comprising:
- receiving, at an automatic speech recognition (ASR) system, a current voice input from a user, the current voice input associated with at least two contexts, each context of the at least two contexts having a respective weight indicating a likelihood that the voice input is associated with the respective context;
  
  generating, by the ASR system, an intermediate recognition result of the current voice input from the user;
  
  adjusting, by the ASR system, the respective weights of the at least two contexts based on the intermediate recognition result; and
  
  transcribing, by the ASR system, the current voice input using a language model, the language model biasing the transcription of the voice input toward one of the at least two contexts based on the adjusted weights.
  - 2. The method of claim 1, wherein the language model comprises an N-gram model.
  - 3. The method of claim 1, wherein adjusting the respective weights of the at least two contexts associated with the current voice input comprises boosting the respective base weight for at least one of the at least two contexts.
  - 4. The method of claim 1, wherein adjusting the respective weights of the at least two contexts based on the intermediate recognition result comprises:
    - determining a most relevant one of the at least two contexts by identifying particular keywords in the intermediate recognition result; and
      
      increasing the respective weight of the most relevant one of the at least two contexts.
  - 5. The method of claim 1, wherein the current voice input from the user is configured to invoke a software application to perform an action using the transcription of the current voice input.
  - 6. The method of claim 1, further comprising providing the transcription of the current voice input to a dialogue system interacting with the user.
  - 7. The method of claim 1, wherein at least one of the at least two contexts comprises data indicating a current time when the current voice input is received at the ASR system.
  - 8. The method of claim 1, wherein at least one of the at least two contexts associated with the current voice input is based on one or more prior voice inputs from the user within a past time period of the current voice input.
  - 9. The method of claim 1, wherein at least one of the at least two contexts comprises named-entities associated with a particular category.
  - 10. The method of claim 1, wherein the ASR system resides on a server in communication with a computing device associated with the user, the computing device configured to capture the current voice input spoken by the user and transmit the captured voice input to the ASR system.
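
As a rough illustration of how claims 1 through 4 fit together, the sketch below boosts the weight of the context whose keywords appear in an intermediate recognition result, then scores candidate transcriptions with a bigram (N-gram) model interpolated toward the weighted contexts. Everything concrete here is an assumption made for the example: the toy probabilities, the keyword sets, the boost factor, the interpolation constant `lam`, and the function names are hypothetical and do not come from the patent.

```python
import math
from typing import Dict, Set, Tuple

Bigram = Tuple[str, str]

# Hypothetical toy bigram probabilities, for illustration only.
BASE_LM: Dict[Bigram, float] = {("play", "jazz"): 0.02, ("play", "chess"): 0.02}
CONTEXT_LMS: Dict[str, Dict[Bigram, float]] = {
    "music": {("play", "jazz"): 0.30},
    "games": {("play", "chess"): 0.30},
}
KEYWORDS: Dict[str, Set[str]] = {
    "music": {"song", "album", "jazz"},
    "games": {"chess", "board", "level"},
}
FLOOR = 1e-4  # probability assigned to unseen bigrams


def boost_most_relevant(weights: Dict[str, float], intermediate: str,
                        boost: float = 3.0) -> Dict[str, float]:
    """Increase the weight of the context with the most keyword hits."""
    words = intermediate.lower().split()
    hits = {c: sum(w in KEYWORDS[c] for w in words) for c in weights}
    best = max(hits, key=hits.get)
    adjusted = dict(weights)
    if hits[best] > 0:  # leave weights unchanged when no keyword matches
        adjusted[best] *= boost
    return adjusted


def biased_log_prob(sentence: str, weights: Dict[str, float],
                    lam: float = 0.5) -> float:
    """Log-probability under a base bigram model mixed with context models."""
    tokens = sentence.lower().split()
    total_w = sum(weights.values())
    logp = 0.0
    for bigram in zip(tokens, tokens[1:]):
        p_base = BASE_LM.get(bigram, FLOOR)
        # Context models contribute in proportion to their adjusted weights.
        p_ctx = sum((w / total_w) * CONTEXT_LMS[c].get(bigram, FLOOR)
                    for c, w in weights.items())
        logp += math.log((1.0 - lam) * p_base + lam * p_ctx)
    return logp
```

Linear interpolation is only one way to bias an N-gram model; it is used here because it keeps each bigram probability a valid mixture. After an intermediate result such as "put on an album and", the music context is boosted and "play jazz" scores higher than "play chess", whereas with equal weights the two candidates tie.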

11. An automatic speech recognition (ASR) system comprising:
- data processing hardware; and
  
  memory hardware in communication with the data processing hardware and storing instructions that when executed on the data processing hardware cause the data processing hardware to perform operations comprising:
  
  receiving a current voice input from a user, the current voice input associated with at least two contexts, each context of the at least two contexts having a respective weight indicating a likelihood that the voice input is associated with the respective context;
  
  generating an intermediate recognition result of the current voice input from the user;
  
  adjusting the respective weights of the at least two contexts based on the intermediate recognition result; and
  
  transcribing the current voice input using a language model, the language model biasing the transcription of the voice input toward one of the at least two contexts based on the adjusted weights.
  - 12. The ASR system of claim 11, wherein the language model comprises an N-gram model.
  - 13. The ASR system of claim 12, wherein adjusting the respective weights of the at least two contexts associated with the current voice input comprises boosting the respective base weight for at least one of the at least two contexts.
  - 14. The ASR system of claim 11, wherein adjusting the respective weights of the at least two contexts based on the intermediate recognition result comprises:
    - determining a most relevant one of the at least two contexts by identifying particular keywords in the intermediate recognition result; and
      
      increasing the respective weight of the most relevant one of the at least two contexts.
  - 15. The ASR system of claim 11, wherein the current voice input from the user is configured to invoke a software application to perform an action using the transcription of the current voice input.
  - 16. The ASR system of claim 11, wherein the operations further comprise providing the transcription of the current voice input to a dialogue system interacting with the user.
  - 17. The ASR system of claim 11, wherein at least one of the at least two contexts comprises data indicating a current time when the current voice input is received at the ASR system.
  - 18. The ASR system of claim 11, wherein at least one of the at least two contexts associated with the current voice input is based on one or more prior voice inputs from the user within a past time period of the current voice input.
  - 19. The ASR system of claim 11, wherein at least one of the at least two contexts comprises named-entities associated with a particular category.
  - 20. The ASR system of claim 11, wherein the data processing hardware and the memory hardware reside on a server in communication with a computing device associated with the user, the computing device configured to capture the current voice input spoken by the user and transmit the current voice input to the ASR system.

Current Assignee
Google LLC (Alphabet Inc.)
Original Assignee
Google LLC (Alphabet Inc.)
Inventors
Aleksic, Petar; Mengibar, Pedro J. Moreno

Granted Patent

US 10,643,617 B2
CPC Class Codes

G06F 16/632   Query formulation

G10L 15/04   Segmentation; Word boundary...

G10L 15/183   using context dependencies,...

G10L 15/19   Grammatical context, e.g. d...

G10L 15/197   Probabilistic grammars, e.g...

G10L 15/22   Procedures used during a sp...

G10L 15/26   Speech to text systems G10L...

G10L 2015/085   Methods for reducing search...
