Performing speech recognition over a network and using speech recognition results
Abstract
Systems, methods and apparatus for generating, distributing, and using speech recognition models. A shared speech processing facility is used to support speech recognition for a wide variety of devices with limited capabilities including business computer systems, personal data assistants, etc., which are coupled to the speech processing facility via a communications channel, e.g., the Internet. Devices with audio capture capability record and transmit to the speech processing facility, via the Internet, digitized speech and receive speech processing services, e.g., speech recognition model generation and/or speech recognition services, in response. The Internet is used to return speech recognition models and/or information identifying recognized words or phrases. The speech processing facility can be used to provide speech recognition capabilities to devices without such capabilities and/or to augment a device's speech processing capability. Voice dialing, telephone control and/or other services are provided by the speech processing facility in response to speech recognition results.
20 Claims
1. A computer-implemented method comprising:
receiving, by a client device, speech data;
initiating, by the client device, a local voice dialing operation using the speech data, wherein initiating the local voice dialing operation comprises:
extracting, by a feature extraction module on the client device, one or more features from the speech data, and
performing, by a speech recognition module on the client device, a speech recognition operation on the extracted features using one or more locally stored speech recognition models;
determining, by the client device, that the local voice dialing operation was unsuccessful;
in response to determining that the local voice dialing operation was unsuccessful, transmitting, by the client device and to a remote speech processing system, a request to perform a remote voice dialing operation, wherein the request includes the speech data or the extracted features; and
receiving, by the client device and from the remote speech processing system, a response to the request.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, if executed by the one or more computers, to cause the one or more computers to perform operations comprising:
receiving, by a client device, speech data;
initiating, by the client device, a local voice dialing operation using the speech data, wherein initiating the local voice dialing operation comprises:
extracting, by a feature extraction module on the client device, one or more features from the speech data, and
performing, by a speech recognition module on the client device, a speech recognition operation on the extracted features using one or more locally stored speech recognition models;
determining, by the client device, that the local voice dialing operation was unsuccessful;
in response to determining that the local voice dialing operation was unsuccessful, transmitting, by the client device and to a remote speech processing system, a request to perform a remote voice dialing operation, wherein the request includes the speech data or the extracted features; and
receiving, by the client device and from the remote speech processing system, a response to the request.
Dependent claims: 13, 14, 15, 16, 17, 18, 19
20. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, if executed, cause the one or more computers to perform operations comprising:
receiving, by a client device, speech data;
initiating, by the client device, a local voice dialing operation using the speech data, wherein initiating the local voice dialing operation comprises:
extracting, by a feature extraction module on the client device, one or more features from the speech data, and
performing, by a speech recognition module on the client device, a speech recognition operation on the extracted features using one or more locally stored speech recognition models;
determining, by the client device, that the local voice dialing operation was unsuccessful;
in response to determining that the local voice dialing operation was unsuccessful, transmitting, by the client device and to a remote speech processing system, a request to perform a remote voice dialing operation, wherein the request includes the speech data or the extracted features; and
receiving, by the client device and from the remote speech processing system, a response to the request.
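Illustrative sketch (not part of the patent text): the control flow recited in the independent claims, local recognition first and a remote request only on failure, might be outlined as below. Every name here (`voice_dial`, `DialResult`, the `toy_*` stubs, the phone numbers) is hypothetical. The sketch assumes that an "unsuccessful" local operation is signalled by the local recognizer returning `None`, and that the request carries the extracted features rather than the raw speech data; the claims leave the first point open and permit either payload.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class DialResult:
    source: str            # "local" or "remote" (hypothetical labels)
    number: Optional[str]  # None if neither path produced a number


def voice_dial(
    speech_data: bytes,
    extract_features: Callable[[bytes], List[float]],
    recognize_locally: Callable[[List[float]], Optional[str]],
    send_remote_request: Callable[[List[float]], Optional[str]],
) -> DialResult:
    """Try local voice dialing; fall back to a remote request on failure."""
    # Local operation: feature extraction, then recognition on the client.
    features = extract_features(speech_data)
    number = recognize_locally(features)
    if number is not None:
        return DialResult("local", number)
    # Assumption: None means the local operation was unsuccessful.
    # Transmit a request containing the extracted features and
    # return whatever the remote system responds with.
    return DialResult("remote", send_remote_request(features))


# Toy stand-ins so the sketch runs; a real device would use an acoustic
# front end, locally stored models, and a network client instead.
def toy_extract(speech: bytes) -> List[float]:
    return [float(b) for b in speech]


LOCAL_MODELS = {tuple(toy_extract(b"home")): "555-0100"}


def toy_local(features: List[float]) -> Optional[str]:
    return LOCAL_MODELS.get(tuple(features))


def toy_remote(features: List[float]) -> Optional[str]:
    return "555-0199"


if __name__ == "__main__":
    print(voice_dial(b"home", toy_extract, toy_local, toy_remote))
    print(voice_dial(b"office", toy_extract, toy_local, toy_remote))
```

Passing the three stages in as callables keeps the fallback logic separate from any particular feature extractor, model store, or transport, none of which the claims specify.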
Specification