Methods and apparatus for generating, updating and distributing speech recognition models

US 8,447,599 B2
Filed: 12/30/2011
Issued: 05/21/2013
Est. Priority Date: 11/30/2000
Status: Expired due to Term

Abstract

Techniques for generating, distributing, and using speech recognition models are described. A shared speech processing facility is used to support speech recognition for a wide variety of devices with limited capabilities including business computer systems, personal data assistants, etc., which are coupled to the speech processing facility via a communications channel, e.g., the Internet. Devices with audio capture capability record and transmit to the speech processing facility, via the Internet, digitized speech and receive speech processing services, e.g., speech recognition model generation and/or speech recognition services, in response. The Internet is used to return speech recognition models and/or information identifying recognized words or phrases. Thus, the speech processing facility can be used to provide speech recognition capabilities to devices without such capabilities and/or to augment a device's speech processing capability. Voice dialing, telephone control and/or other services are provided by the speech processing facility in response to speech recognition results.

20 Claims

1. A computer-implemented method comprising:
- receiving, by a speech recognition model training system and from a first computing device, (i) a request for an update to a speaker-dependent speech recognition model associated with the first computing device, and (ii) recorded speech;
  
  generating, based on processing the recorded speech from the first computing device and by the speech recognition model training system, the update to the speaker-dependent speech recognition model associated with the first computing device;
  
  generating, based on processing the recorded speech from the first computing device and by the speech recognition model training system, an update for a speaker-independent speech recognition model associated with a second computing device;
  
  transmitting, based on receiving the request and by the speech recognition model training system, the update to the speaker-dependent speech recognition model to the first computing device;
  
  determining, by the speech recognition model training system, that a predetermined period of time has elapsed since the speaker-independent speech recognition model associated with the second computing device was last updated; and
  
  transmitting, based on determining that the predetermined period of time has elapsed since the speaker-independent speech recognition model associated with the second computing device was last updated and by the speech recognition model training system, the update to the speaker-independent speech recognition model to the second computing device.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The method of claim 1, further comprising receiving, by the speech recognition model training system, a transcription of the recorded speech.
  - 3. The method of claim 2, further comprising receiving, by the speech recognition model training system, the speaker-dependent speech recognition model corresponding to the recorded speech.
  - 4. The method of claim 3, further comprising receiving, by the speech recognition model training system, a user identifier associated with the first computing device.
  - 5. The method of claim 1, further comprising receiving, by the speech recognition model training system, an extracted set of feature information corresponding to the recorded speech.
  - 6. The method of claim 5, wherein the extracted set of feature information includes one or more of timing, duration, and amplitude of the recorded speech, and includes digitization of the recorded speech.
  - 7. The method of claim 6, wherein the extracted set of feature information includes changes in the one or more of timing, duration, and amplitude of the recorded speech.
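
The six steps of claim 1 can be read as a small control flow: one batch of recorded speech yields two updates, the speaker-dependent (SD) update is returned because it was requested, and the speaker-independent (SI) update is held until a timer condition is met. The following is a minimal sketch of that flow, not the patented implementation. The class `TrainingSystem`, its fields, the `_train` placeholder, and the `outbox` list standing in for network transmission are all hypothetical names introduced for illustration; the claims do not say how training or transmission is carried out.

```python
from dataclasses import dataclass, field


@dataclass
class TrainingSystem:
    """Hypothetical stand-in for the claimed model training system."""
    si_update_period_s: float  # the "predetermined period of time" (assumed to be in seconds)
    si_last_updated: dict = field(default_factory=dict)    # device id -> time of last SI update
    pending_si_update: dict = field(default_factory=dict)  # device id -> SI update awaiting delivery
    outbox: list = field(default_factory=list)             # (device id, kind, payload); stands in for transmission

    def handle_request(self, first_device, recorded_speech, second_device):
        # Receive the request and the recorded speech, then derive both
        # updates from that same speech.
        sd_update = self._train(recorded_speech, kind="SD")
        self.pending_si_update[second_device] = self._train(recorded_speech, kind="SI")
        # The SD update goes back to the requesting device right away.
        self.outbox.append((first_device, "SD", sd_update))

    def tick(self, now):
        # Send each pending SI update only once the predetermined period
        # has elapsed since that device's SI model was last updated.
        for device, update in list(self.pending_si_update.items()):
            last = self.si_last_updated.get(device, float("-inf"))
            if now - last >= self.si_update_period_s:
                self.outbox.append((device, "SI", update))
                self.si_last_updated[device] = now
                del self.pending_si_update[device]

    @staticmethod
    def _train(recorded_speech, kind):
        # Placeholder only: the claims do not describe the training procedure.
        return {"kind": kind, "samples_used": len(recorded_speech)}


system = TrainingSystem(si_update_period_s=3600.0, si_last_updated={"device-B": 0.0})
system.handle_request("device-A", [0.1, -0.2, 0.05], "device-B")
system.tick(now=100.0)   # period not yet elapsed: only the SD update has been sent
system.tick(now=3600.0)  # period elapsed: the SI update is sent to device-B
```

Keeping the SD path request-driven and the SI path timer-driven mirrors the two different triggers named in the transmitting steps of the claim.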

8. A non-transitory computer-readable storage medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
- receiving, by a speech recognition model training system and from a first computing device, (i) a request for an update to a speaker-dependent speech recognition model associated with the first computing device, and (ii) recorded speech;
  
  generating, based on processing the recorded speech from the first computing device and by the speech recognition model training system, the update to the speaker-dependent speech recognition model associated with the first computing device;
  
  generating, based on processing the recorded speech from the first computing device and by the speech recognition model training system, an update for a speaker-independent speech recognition model associated with a second computing device;
  
  transmitting, based on receiving the request and by the speech recognition model training system, the update to the speaker-dependent speech recognition model to the first computing device;
  
  determining, by the speech recognition model training system, that a predetermined period of time has elapsed since the speaker-independent speech recognition model associated with the second computing device was last updated; and
  
  transmitting, based on determining that the predetermined period of time has elapsed since the speaker-independent speech recognition model associated with the second computing device was last updated and by the speech recognition model training system, the update to the speaker-independent speech recognition model to the second computing device.
- Dependent claims (9, 10, 11, 12, 13, 14):
  - 9. The medium of claim 8, further comprising operations of receiving, by the speech recognition model training system, a transcription of the recorded speech.
  - 10. The medium of claim 8, further comprising operations of receiving, by the speech recognition model training system, the speaker-dependent speech recognition model corresponding to the recorded speech.
  - 11. The medium of claim 10, further comprising operations of receiving, by the speech recognition model training system, a user identifier of the first computing device.
  - 12. The medium of claim 8, further comprising operations of receiving, by the speech recognition model training system, an extracted set of feature information corresponding to the recorded speech.
  - 13. The medium of claim 12, wherein the extracted set of feature information includes one or more of timing, duration, and amplitude of the recorded speech, and includes digitization of the recorded speech.
  - 14. The medium of claim 12, wherein the extracted set of feature information includes changes in the one or more of timing, duration, and amplitude of the recorded speech.

15. A speech processing system, comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  
  receiving, by a speech recognition model training system and from a first computing device, (i) a request for an update to a speaker-dependent speech recognition model associated with the first computing device, and (ii) recorded speech;
  
  generating, based on processing the recorded speech from the first computing device and by the speech recognition model training system, the update to the speaker-dependent speech recognition model associated with the first computing device;
  
  generating, based on processing the recorded speech from the first computing device and by the speech recognition model training system, an update for a speaker-independent speech recognition model associated with a second computing device;
  
  transmitting, based on receiving the request and by the speech recognition model training system, the update to the speaker-dependent speech recognition model to the first computing device;
  
  determining, by the speech recognition model training system, that a predetermined period of time has elapsed since the speaker-independent speech recognition model associated with the second computing device was last updated; and
  
  transmitting, based on determining that the predetermined period of time has elapsed since the speaker-independent speech recognition model associated with the second computing device was last updated and by the speech recognition model training system, the update to the speaker-independent speech recognition model to the second computing device.
- Dependent claims (16, 17, 18, 19, 20):
  - 16. The system of claim 15, further comprising operations of receiving, by the speech recognition model training system, a transcription of the recorded speech.
  - 17. The system of claim 16, further comprising operations of receiving, by the speech recognition model training system, the speaker-dependent speech recognition model corresponding to the recorded speech.
  - 18. The system of claim 17, further comprising operations of receiving, by the speech recognition model training system, a user identifier of the first computing device.
  - 19. The system of claim 15, further comprising operations of receiving, by the speech recognition model training system, the speaker-dependent speech recognition model corresponding to the recorded speech.
  - 20. The system of claim 19, further comprising operations of receiving, by the speech recognition model training system, an extracted set of feature information that includes one or more of timing, duration, and amplitude of the recorded speech, and changes in one or more of timing, duration, and amplitude of the recorded speech.
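
The dependent claims refer to feature information covering timing, duration, and amplitude of the recorded speech, together with changes in those quantities. As a rough illustration of what such a feature set might look like, the sketch below splits digitized samples into fixed-length frames, records each frame's start time, duration, and RMS amplitude, and then takes frame-to-frame differences as the "changes". The framing scheme, the use of RMS, and the function names `frame_features` and `amplitude_changes` are assumptions made for this example; the claims do not specify how the features are computed.

```python
import math


def frame_features(samples, sample_rate, frame_len):
    """Per-frame timing, duration and RMS amplitude over non-overlapping frames."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        feats.append({
            "time_s": start / sample_rate,          # when the frame begins
            "duration_s": frame_len / sample_rate,  # how long the frame lasts
            "amplitude": rms,                       # RMS level of the frame
        })
    return feats


def amplitude_changes(feats):
    """Frame-to-frame change in amplitude (a simple first difference)."""
    return [b["amplitude"] - a["amplitude"] for a, b in zip(feats, feats[1:])]


samples = [0.0, 0.0, 1.0, -1.0, 0.5, 0.5]
feats = frame_features(samples, sample_rate=2, frame_len=2)
changes = amplitude_changes(feats)
```

With these toy inputs the three frames have amplitudes 0.0, 1.0 and 0.5, so the change sequence is one rise followed by one fall.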

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Reding, Craig; Levas, Suzi
Primary Examiner(s): Han, Qi
Application Number: US13/340,954
Publication Number: US 20120101812A1
Time in Patent Office: 508 Days
Field of Search: 704/201, 704/231, 704/235, 704/251, 704/270
US Class Current: 704/231
CPC Class Codes:
G10L 15/00   Speech recognition G10L17/0...
G10L 15/063   Training
G10L 15/30   Distributed recognition, e....
