Performing speech recognition over a network and using speech recognition results
First Claim
1. A computer-implemented method comprising:
receiving, by a speech recognition model training system and from a client device, a request to generate a new speech recognition model for the client device, wherein the request includes:
(i) one or more features extracted from speech data by a feature extractor on the client device, and
(ii) metadata regarding the speech recognition model to be generated;
generating, by the speech recognition model training system and using the one or more features extracted from speech data by the feature extractor on the client device, the new speech recognition model according to the metadata; and
transmitting, by the speech recognition model training system and to the client device, at least a portion of the new speech recognition model.
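The three steps of the claim (receive a request carrying client-extracted features and metadata, generate a model according to the metadata, transmit at least a portion of the model back) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: `ModelRequest`, `SpeechModel`, `generate_model`, `handle_request`, the `"model_type"` metadata key, and the choice of a per-word averaged template as the "model" are all assumptions made only to keep the example small.

```python
from dataclasses import dataclass
from statistics import fmean
from typing import Dict, List, Optional, Sequence


@dataclass
class ModelRequest:
    """Request sent by the client device (field names are illustrative)."""
    # word -> feature vectors already extracted on the client device
    features: Dict[str, List[Sequence[float]]]
    # metadata about the model to be generated, e.g. {"model_type": "template"}
    metadata: Dict[str, str]


@dataclass
class SpeechModel:
    """Toy model: one averaged feature vector (template) per word."""
    model_type: str
    templates: Dict[str, List[float]]


def generate_model(request: ModelRequest) -> SpeechModel:
    """Build a model from the client-extracted features, as directed by the metadata."""
    model_type = request.metadata.get("model_type", "template")
    if model_type != "template":
        raise ValueError(f"unsupported model type: {model_type}")
    templates: Dict[str, List[float]] = {}
    for word, vectors in request.features.items():
        # Average each feature dimension across the example vectors for this word.
        templates[word] = [fmean(dim) for dim in zip(*vectors)]
    return SpeechModel(model_type=model_type, templates=templates)


def handle_request(request: ModelRequest,
                   words: Optional[Sequence[str]] = None) -> SpeechModel:
    """Receive the request, generate the model, and return all or a portion of it."""
    model = generate_model(request)
    if words is None:
        return model
    # Return only the requested portion of the new model.
    portion = {w: model.templates[w] for w in words if w in model.templates}
    return SpeechModel(model_type=model.model_type, templates=portion)
```

In this sketch the server never sees raw audio; it works only from the feature vectors in the request, which mirrors the claim's requirement that feature extraction happens on the client device. A real training system would use a far richer model than an averaged template.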
Abstract
Systems, methods and apparatus for generating, distributing, and using speech recognition models. A shared speech processing facility is used to support speech recognition for a wide variety of devices with limited capabilities including business computer systems, personal data assistants, etc., which are coupled to the speech processing facility via a communications channel, e.g., the Internet. Devices with audio capture capability record and transmit to the speech processing facility, via the Internet, digitized speech and receive speech processing services, e.g., speech recognition model generation and/or speech recognition services, in response. The Internet is used to return speech recognition models and/or information identifying recognized words or phrases. The speech processing facility can be used to provide speech recognition capabilities to devices without such capabilities and/or to augment a device's speech processing capability. Voice dialing, telephone control and/or other services are provided by the speech processing facility in response to speech recognition results.
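As a rough illustration of how a service such as voice dialing could follow from a speech recognition result, the sketch below matches an utterance against stored word templates and then looks up a number for the recognized name. It is a toy under stated assumptions, not the facility described in the abstract: the names `recognize` and `voice_dial`, the nearest-template (Euclidean distance) matching, the `directory` mapping, and the idea that the digitized speech has already been reduced to a single feature vector are all hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence


def recognize(features: Sequence[float],
              templates: Dict[str, List[float]]) -> str:
    """Return the word whose template is closest (Euclidean) to the features."""
    if not templates:
        raise ValueError("no templates available for recognition")
    # Assumes every template has the same dimensionality as the features.
    return min(templates, key=lambda word: math.dist(features, templates[word]))


def voice_dial(features: Sequence[float],
               templates: Dict[str, List[float]],
               directory: Dict[str, str]) -> Optional[str]:
    """Recognize a spoken name, then return the number to dial (None if unknown)."""
    name = recognize(features, templates)
    return directory.get(name)
```

For example, with templates for two names and a small directory, `voice_dial` returns the number associated with whichever name's template lies nearest to the incoming feature vector. The service is driven by the recognition result, not by the raw audio.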
19 Claims
1. A computer-implemented method comprising:
receiving, by a speech recognition model training system and from a client device, a request to generate a new speech recognition model for the client device, wherein the request includes:
(i) one or more features extracted from speech data by a feature extractor on the client device, and
(ii) metadata regarding the speech recognition model to be generated;
generating, by the speech recognition model training system and using the one or more features extracted from speech data by the feature extractor on the client device, the new speech recognition model according to the metadata; and
transmitting, by the speech recognition model training system and to the client device, at least a portion of the new speech recognition model.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
receiving, by a speech recognition model training system and from a client device, a request to generate a new speech recognition model for the client device, wherein the request includes:
(i) one or more features extracted from speech data by a feature extractor on the client device, and
(ii) metadata regarding the speech recognition model to be generated;
generating, by the speech recognition model training system and using the one or more features extracted from speech data by the feature extractor on the client device, the new speech recognition model according to the metadata; and
transmitting, by the speech recognition model training system and to the client device, at least a portion of the new speech recognition model.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
receiving, by a speech recognition model training system and from a client device, a request to generate a new speech recognition model for the client device, wherein the request includes:
(i) one or more features extracted from speech data by a feature extractor on the client device, and
(ii) metadata regarding the speech recognition model to be generated;
generating, by the speech recognition model training system and using the one or more features extracted from speech data by the feature extractor on the client device, the new speech recognition model according to the metadata; and
transmitting, by the speech recognition model training system and to the client device, at least a portion of the new speech recognition model.
Specification