Performing speech recognition over a network and using speech recognition results

US 8,335,687 B1
Filed: 03/19/2012
Issued: 12/18/2012
Est. Priority Date: 11/30/2000
Status: Expired due to Term

Abstract

Systems, methods and apparatus for generating, distributing, and using speech recognition models. A shared speech processing facility is used to support speech recognition for a wide variety of devices with limited capabilities including business computer systems, personal data assistants, etc., which are coupled to the speech processing facility via a communications channel, e.g., the Internet. Devices with audio capture capability record and transmit to the speech processing facility, via the Internet, digitized speech and receive speech processing services, e.g., speech recognition model generation and/or speech recognition services, in response. The Internet is used to return speech recognition models and/or information identifying recognized words or phrases. The speech processing facility can be used to provide speech recognition capabilities to devices without such capabilities and/or to augment a device's speech processing capability. Voice dialing, telephone control and/or other services are provided by the speech processing facility in response to speech recognition results.

19 Claims

1. A computer-implemented method comprising:
- receiving, by a speech recognition model training system and from a client device, a request to generate a new speech recognition model for the client device, wherein the request includes:
  
  (i) one or more features extracted from speech data by a feature extractor on the client device, and (ii) metadata regarding the speech recognition model to be generated;
  
  generating, by the speech recognition model training system and using the one or more features extracted from speech data by the feature extractor on the client device, the new speech recognition model according to the metadata; and
  
  transmitting, by the speech recognition model training system and to the client device, at least a portion of the new speech recognition model.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9)
  - 2. The method of claim 1, wherein the metadata includes an identifier identifying a user of the client device, and wherein generating the new speech recognition model according to the metadata comprises generating a new speaker-dependent speech recognition model for the user of the client device.
  - 3. The method of claim 1, wherein the metadata includes a textual representation of a word or phrase included in the speech data.
  - 4. The method of claim 1, wherein the metadata includes speech recognition model type information, and wherein generating the new speech recognition model according to the metadata comprises generating a type of speech of speech recognition model identified by the speech recognition model type information.
  - 5. The method of claim 1, wherein the speech recognition model type information comprises data that specifies that the speech recognition to be generated is a word processing-type speech recognition model or a voice dialing-type speech recognition model.
  - 6. The method of claim 1, wherein the metadata identifies an existing speech recognition model to be updated.
  - 7. The method of claim 1, wherein the request further includes the speech data.
  - 8. The method of claim 7, comprising:
    - extracting, by a feature extractor on the speech recognition model training system, one or more features from the speech data, wherein the speech recognition model is generated using the one or more features extracted from the speech data by the feature extractor on the client device and the one or more features extracted from the speech data by the feature extractor on the speech recognition model training system.
  - 9. The method of claim 1, wherein generating a new speech recognition model includes updating an existing speech recognition model.

10. A system comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  
  receiving, by a speech recognition model training system and from a client device, a request to generate a new speech recognition model for the client device, wherein the request includes:
  
  (i) one or more features extracted from speech data by a feature extractor on the client device, and (ii) metadata regarding the speech recognition model to be generated;
  
  generating, by the speech recognition model training system and using the one or more features extracted from speech data by the feature extractor on the client device, the new speech recognition model according to the metadata; and
  
  transmitting, by the speech recognition model training system and to the client device, at least a portion of the new speech recognition model.
- Dependent claims (11, 12, 13, 14, 15, 16, 17, 18)
  - 11. The system of claim 10, wherein the metadata includes an identifier identifying a user of the client device, and wherein generating the new speech recognition model according to the metadata comprises generating a new speaker-dependent speech recognition model for the user of the client device.
  - 12. The system of claim 10, wherein the metadata includes a textual representation of a word or phrase included in the speech data.
  - 13. The system of claim 10, wherein the metadata includes speech recognition model type information, and wherein generating the new speech recognition model according to the metadata comprises generating a type of speech of speech recognition model identified by the speech recognition model type information.
  - 14. The system of claim 10, wherein the speech recognition model type information comprises data that specifies that the speech recognition to be generated is a word processing-type speech recognition model or a voice dialing-type speech recognition model.
  - 15. The system of claim 10, wherein the metadata identifies an existing speech recognition model to be updated.
  - 16. The system of claim 10, wherein the request further includes the speech data.
  - 17. The system of claim 16, wherein the operations comprise:
    - extracting, by a feature extractor on the speech recognition model training system, one or more features from the speech data, wherein the speech recognition model is generated using the one or more features extracted from the speech data by the feature extractor on the client device and the one or more features extracted from the speech data by the feature extractor on the speech recognition model training system.
  - 18. The system of claim 10, wherein generating a new speech recognition model includes updating an existing speech recognition model.

19. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
- receiving, by a speech recognition model training system and from a client device, a request to generate a new speech recognition model for the client device, wherein the request includes:
  
  (i) one or more features extracted from speech data by a feature extractor on the client device, and (ii) metadata regarding the speech recognition model to be generated;
  
  generating, by the speech recognition model training system and using the one or more features extracted from speech data by the feature extractor on the client device, the new speech recognition model according to the metadata; and
  
  transmitting, by the speech recognition model training system and to the client device, at least a portion of the new speech recognition model.
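
The request/generate/transmit flow recited in the independent claims can be illustrated with a minimal Python sketch. Everything named here (`ModelMetadata`, `TrainingRequest`, `SpeechModel`, `TrainingSystem`, the field names, and the "average the feature frames into one template" training step) is a hypothetical stand-in chosen for illustration; the claims do not specify a data format or a training algorithm, and the return value merely stands in for transmitting the model back to the client.

```python
from dataclasses import dataclass, field
from statistics import fmean
from typing import Dict, List, Optional


@dataclass
class ModelMetadata:
    """Hypothetical metadata fields, loosely mirroring the dependent claims."""
    user_id: Optional[str] = None            # identifies the user (speaker-dependent model)
    transcript: Optional[str] = None         # text of the word or phrase in the speech data
    model_type: str = "voice_dialing"        # e.g. "voice_dialing" or "word_processing"
    existing_model_id: Optional[str] = None  # existing model to update, if any


@dataclass
class TrainingRequest:
    features: List[List[float]]  # feature vectors extracted on the client device
    metadata: ModelMetadata


@dataclass
class SpeechModel:
    model_id: str
    model_type: str
    speaker_dependent: bool
    templates: Dict[str, List[float]] = field(default_factory=dict)


class TrainingSystem:
    """Toy stand-in for the speech recognition model training system."""

    def __init__(self) -> None:
        self._models: Dict[str, SpeechModel] = {}

    def handle_request(self, request: TrainingRequest) -> SpeechModel:
        meta = request.metadata
        if not request.features:
            raise ValueError("request must include at least one feature vector")

        # Update an existing model when the metadata names one we know about.
        model = self._models.get(meta.existing_model_id) if meta.existing_model_id else None
        if model is None:
            model_id = f"{meta.user_id or 'generic'}:{meta.model_type}"
            model = SpeechModel(model_id, meta.model_type, meta.user_id is not None)

        # Placeholder "training": average the frames into one template vector.
        template = [fmean(column) for column in zip(*request.features)]
        model.templates[meta.transcript or "<unlabeled>"] = template

        self._models[model.model_id] = model
        return model  # stands in for transmitting the model to the client


if __name__ == "__main__":
    system = TrainingSystem()
    request = TrainingRequest(
        features=[[1.0, 2.0], [3.0, 4.0]],
        metadata=ModelMetadata(user_id="alice", transcript="call home"),
    )
    print(system.handle_request(request))
```

In this sketch the metadata alone decides whether a new model is created or an existing one is updated, and whether the result is marked speaker-dependent, which is the role the claims assign to the metadata.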


Current Assignee
Google LLC (Alphabet Inc.)
Original Assignee
Google Inc. (Alphabet Inc.)
Inventors
Reding, Craig L., Levas, Suzi
Primary Examiner(s)
MCFADDEN, SUSAN IRIS

Application Number

US13/423,595
Time in Patent Office

274 Days
Field of Search

704/231
US Class Current

704/231
CPC Class Codes

G10L 13/08   Text analysis or generation...

G10L 15/02   Feature extraction for spee...

G10L 15/063   Training

G10L 15/08   Speech classification or se...

G10L 15/22   Procedures used during a sp...

G10L 15/26   Speech to text systems G10L...

G10L 15/30   Distributed recognition, e....

G10L 17/04   Training, enrolment or mode...

G10L 2015/221   Announcement of recognition...

G10L 2015/223   Execution procedure of a sp...

H04M 2201/40   using speech recognition

H04M 2207/18   wireless networks

H04M 3/42204   Arrangements at the exchang...
