Speech recognition system with network accessible speech processing resources
Abstract
A method of speech recognition including receiving speech signals into a front-end processor and storing at least some resources used for speech recognition in a network-attached server. The front-end processor is coupled to the network-attached server to perform the speech recognition.
28 Claims
1. A method of speech recognition comprising the acts of:
receiving speech signals into a front-end processor;
storing at least some speaker-dependent and/or speaker group-dependent resources used for speech recognition in a network-attached server, including resources that implement a mapping between the speech signals and tokens that have meaning to a voice enabled application;
coupling the front-end processor to the network-attached server over a network to perform the speech recognition using the resources.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
transmitting the speech signal to the network-attached server; and
performing speech recognition of the speech signal in the network-attached processor.
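The acts of claim 1, together with the transmit-and-recognize acts above, can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `ResourceServer` and `FrontEnd`, the nearest-template matching, and the direct method call that stands in for the network are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceServer:
    """Hypothetical stand-in for the network-attached server."""
    # speaker id -> {signal template (tuple of floats) -> application token}
    resources: dict = field(default_factory=dict)

    def store(self, speaker_id, template, token):
        """Store one speaker-dependent mapping from a signal template to a token."""
        self.resources.setdefault(speaker_id, {})[tuple(template)] = token

    def recognize(self, speaker_id, signal):
        """Return the token whose stored template is nearest to the signal."""
        templates = self.resources.get(speaker_id, {})
        if not templates:
            return None

        def distance(template):
            return sum((a - b) ** 2 for a, b in zip(template, signal))

        return templates[min(templates, key=distance)]


@dataclass
class FrontEnd:
    """Hypothetical front-end processor that receives the speech signal."""
    server: ResourceServer

    def handle(self, speaker_id, signal):
        # Assumption: a real system would send this over a network.
        return self.server.recognize(speaker_id, signal)


server = ResourceServer()
server.store("alice", [0.1, 0.9, 0.2], "OPEN_MAIL")
server.store("alice", [0.8, 0.1, 0.7], "CALL_HOME")

front_end = FrontEnd(server)
token = front_end.handle("alice", [0.75, 0.2, 0.65])
unknown = front_end.handle("bob", [0.0, 0.0, 0.0])
```

Keeping the per-speaker mappings on the server means any front end that can reach it gets the same speaker-dependent behavior without holding the resources locally.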
6. The method of claim 5 further comprising the acts of:
returning a response from the network-attached server over the network to the front-end processor comprising phoneme probabilities corresponding to the speech signal.
7. The method of claim 5 further comprising the acts of:
returning a response from the network-attached server to the front-end processor comprising text corresponding to the speech signal.
8. The method of claim 1 wherein the front-end processor comprises a trainable neural network and the stored resources comprise a neural network training file.
9. The method of claim 1 wherein the stored resources comprise voice signal:value pairs associating a particular value with each voice signal.
10. A speech recognition server comprising:
a network interface configured to receive a request from an external entity;
an identification of a speaker associated with each request;
speaker-dependent signature data structures stored in the speech recognition server; and
means for generating a response including speaker-dependent voice recognition resources in response to the received request, wherein the voice recognition resources are used by the external entity to perform speech recognition.
Dependent claims: 11, 12, 13, 14, 15, 16, 17.
an interface for receiving a feedback message corresponding to a response, the feedback message indicating whether the voice recognition resources supplied in the response were useful; and
processes executing within the server for adapting the speaker dependent signature data structures in response to the feedback message.
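The feedback loop described above (a response, a feedback message saying whether the supplied resources were useful, and adaptation of the speaker-dependent signature) might be sketched as follows. This is only an illustration under stated assumptions: the class names, the usefulness score per resource, and the simple update rule are hypothetical and are not the patent's method.

```python
from dataclasses import dataclass, field


@dataclass
class SpeakerSignature:
    """Hypothetical signature: a usefulness score in [0, 1] per resource."""
    scores: dict = field(default_factory=dict)


class RecognitionServer:
    def __init__(self, learning_rate=0.5):
        self.signatures = {}      # speaker id -> SpeakerSignature
        self.pending = {}         # response id -> (speaker id, resource name)
        self.learning_rate = learning_rate
        self.next_response_id = 0

    def register(self, speaker_id, resource_names):
        scores = {name: 0.5 for name in resource_names}
        self.signatures[speaker_id] = SpeakerSignature(scores)

    def handle_request(self, speaker_id):
        """Respond with the resource currently scored highest for this speaker."""
        scores = self.signatures[speaker_id].scores
        resource = max(scores, key=scores.get)
        response_id = self.next_response_id
        self.next_response_id += 1
        self.pending[response_id] = (speaker_id, resource)
        return response_id, resource

    def handle_feedback(self, response_id, useful):
        """Move the score of the supplied resource toward 1 (useful) or 0."""
        speaker_id, resource = self.pending.pop(response_id)
        scores = self.signatures[speaker_id].scores
        target = 1.0 if useful else 0.0
        scores[resource] += self.learning_rate * (target - scores[resource])


server = RecognitionServer()
server.register("alice", ["model_a", "model_b"])

first_id, first_resource = server.handle_request("alice")
server.handle_feedback(first_id, useful=False)
second_id, second_resource = server.handle_request("alice")
```

After the negative feedback, the score of the first resource drops, so the next request for the same speaker is answered with the other resource.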
18. A speech recognition system comprising:
a centralized resource of shared, speaker-dependent speech recognition resources;
two or more applications having processes for receiving a voice signal and communicating with the centralized resource over a network to perform speech recognition on the voice signal using the speech recognition resources from the centralized resource.
Dependent claims: 19, 20, 21, 22, 23, 24.
a feedback message generated by one of the applications to the centralized resource, the feedback message indicating efficacy of the shared speech recognition resource; and
processes within the centralized resource operable to use the feedback message to adapt the shared speech recognition resource to improve efficacy.
25. A speech-enabled software application comprising:
a first interface for receiving a voice signal from a speaker;
a second interface for sending the voice signal over a network to a centralized speech recognition server;
a third interface for receiving phoneme probabilities from the speech recognition server corresponding to the voice signal; and
processes for using the phoneme probabilities to launch speech-enabled functions for the speaker.
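A speech-enabled application along the lines of claim 25 could be sketched as below. The sketch assumes that the server returns one dictionary of phoneme probabilities per audio frame and that the application picks the most probable phoneme in each frame. The names `SpeechEnabledApp`, `decode_phonemes`, and `fake_server` are hypothetical, and the server is replaced by a local function.

```python
def decode_phonemes(frames):
    """Pick the most probable phoneme per frame and collapse adjacent repeats."""
    decoded = []
    for probabilities in frames:
        best = max(probabilities, key=probabilities.get)
        if not decoded or decoded[-1] != best:
            decoded.append(best)
    return tuple(decoded)


class SpeechEnabledApp:
    def __init__(self, send_to_server):
        # Stands in for the second and third interfaces: send the signal,
        # receive phoneme probabilities back.
        self.send_to_server = send_to_server
        self.commands = {}  # phoneme sequence -> function to launch

    def register(self, phonemes, function):
        self.commands[tuple(phonemes)] = function

    def on_voice_signal(self, signal):
        """First interface: accept a voice signal and launch a matching function."""
        frames = self.send_to_server(signal)
        function = self.commands.get(decode_phonemes(frames))
        return function() if function else None


def fake_server(signal):
    # Assumption: fixed per-frame probabilities in place of a real server reply.
    return [
        {"s": 0.9, "t": 0.1},
        {"s": 0.6, "t": 0.4},
        {"t": 0.7, "aa": 0.3},
        {"aa": 0.8, "p": 0.2},
        {"p": 0.9, "t": 0.1},
    ]


app = SpeechEnabledApp(fake_server)
app.register(["s", "t", "aa", "p"], lambda: "playback stopped")
result = app.on_voice_signal(b"\x00\x01")
```

Receiving probabilities rather than finished text leaves the final decision to the application, which can match them against its own small command vocabulary.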
26. A method of creating a speech sample database comprising:
accepting a voice recognition task at an application;
communicating the voice recognition task to a centralized resource;
performing the voice recognition task at the centralized resource;
causing the application to evaluate correctness of the voice recognition; and
storing the speech sample from the task with its recognition result in a speech sample database.
Dependent claims: 27, 28.
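The steps of claim 26 can be illustrated with a small Python sketch that uses an in-memory SQLite database. The table layout, the function names, and the toy `recognize` and `evaluate` callables are assumptions made for the example, not details from the patent.

```python
import sqlite3


def create_database():
    """Create an in-memory speech sample database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE samples ("
        "id INTEGER PRIMARY KEY, speaker TEXT, signal BLOB, "
        "result TEXT, correct INTEGER)"
    )
    return conn


def run_task(conn, speaker, signal, recognize, evaluate):
    """Recognize a sample, let the application judge it, and store both."""
    result = recognize(signal)   # performed at the centralized resource
    correct = evaluate(result)   # the application's correctness judgment
    conn.execute(
        "INSERT INTO samples (speaker, signal, result, correct) "
        "VALUES (?, ?, ?, ?)",
        (speaker, signal, result, int(correct)),
    )
    conn.commit()
    return result, correct


def recognize(signal):
    # Assumption: a toy recognizer keyed on the first byte of the sample.
    return "yes" if signal.startswith(b"\x01") else "no"


def evaluate(result):
    # Assumption: in this toy task the application expects the answer "yes".
    return result == "yes"


conn = create_database()
run_task(conn, "alice", b"\x01\x02", recognize, evaluate)
run_task(conn, "alice", b"\x00\x02", recognize, evaluate)
rows = conn.execute(
    "SELECT speaker, result, correct FROM samples ORDER BY id"
).fetchall()
```

Storing the application's correctness judgment next to each sample and its result makes the stored records usable later as labeled data.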
Specification