Automated learning for speech-based applications
First Claim
1. A method comprising:
receiving, by a computer-based speech recognition system that is communicatively coupled to a communication network, a speech input including one or more words or phrases, the computer-based speech recognition system comprising at least one processor and at least one memory device, wherein the speech input is from a call to a call center via the communication network;
determining, by the computer-based speech recognition system, to provide the speech input to a human;
receiving a response from the human that identifies a first task for the speech input, the first task including a first transcription of the speech input;
performing the first task that is identified by the human for the speech input;
processing the speech input to identify a second task for the speech input, the processing using a set of internal representations of the computer-based speech recognition system, the set of internal representations comprising one or more machine-readable parameters for recognizing speech in a speech utterance, the second task including a second transcription of the speech input;
comparing the first transcription of the speech input included in the first task identified by the human with the second transcription of the speech input included in the second task identified by the computer-based recognition system to determine one or more differences between the first transcription and the second transcription;
modifying the set of internal representations of the computer-based speech recognition system based at least in part on the one or more differences between the first transcription and the second transcription to create a modified set of internal representations, wherein the modifying includes adjusting at least a portion of the set of internal representations;
checking the performance of the modified set of internal representations to prevent the modification from degrading the set of internal representations, wherein the checking comprises determining that a performance difference between the set of internal representations before and after modification is within a margin of error;
receiving, by the computer-based speech recognition system, another input; and
processing, by the at least one processor and based at least in part on the modified set of internal representations, the other input to identify a third task for the other input.
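The checking step recited above can be illustrated with a minimal sketch. The claim does not name a performance metric or say how the margin of error is applied, so everything specific here is an assumption: word error rate (WER) on a held-out set of human-transcribed utterances stands in for "performance", the modification is accepted when WER rises by no more than a fixed `margin`, and the names `word_error_rate`, `mean_wer`, `accept_modification`, and `held_out` are hypothetical.

```python
from typing import Callable, Sequence, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 0.0 if not hyp else 1.0
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def mean_wer(recognize: Callable[[str], str],
             held_out: Sequence[Tuple[str, str]]) -> float:
    """Average WER of a recognizer over (utterance_id, human_transcript) pairs."""
    return sum(word_error_rate(ref, recognize(utt))
               for utt, ref in held_out) / len(held_out)


def accept_modification(old_recognize: Callable[[str], str],
                        new_recognize: Callable[[str], str],
                        held_out: Sequence[Tuple[str, str]],
                        margin: float = 0.01) -> bool:
    """Assumed policy: keep the modified parameters only if WER did not
    rise by more than `margin` relative to the unmodified parameters."""
    return mean_wer(new_recognize, held_out) - mean_wer(old_recognize, held_out) <= margin


# Toy recognizers backed by lookup tables, for illustration only.
held_out = [("a1", "pay my bill"), ("a2", "check my balance")]
old = {"a1": "play my bill", "a2": "check my balance"}.get
better = {"a1": "pay my bill", "a2": "check my balance"}.get
worse = {"a1": "play my pill", "a2": "check balance"}.get

keep_better = accept_modification(old, better, held_out)
keep_worse = accept_modification(old, worse, held_out)
```

In this toy run the modified set that fixes "play" to "pay" is kept, while the set that introduces further errors is rejected and the earlier parameters would remain in use.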
Abstract
Systems and methods for modifying a computer-based speech recognition system. A speech utterance is processed with the computer-based speech recognition system using a set of internal representations, which may comprise parameters for recognizing speech in a speech utterance, such as parameters of an acoustic model and/or a language model. The computer-based speech recognition system may perform a first task in response to the processed speech utterance. The utterance may also be provided to a human who performs a second task based on the utterance. Data indicative of the first task, performed by the computer system, is compared to data indicative of a second task, performed by the human in response to the speech utterance. Based on the comparison, the set of internal representations may be updated or modified to improve the speech recognition performance and capabilities of the speech recognition system.
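The comparison described in the abstract can be sketched as a word-level diff between the human transcription and the machine transcription. This is only an illustration under stated assumptions: the source does not specify an alignment method, so Python's standard `difflib.SequenceMatcher` is swapped in, and the function name `transcription_differences` and the sample utterances are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def transcription_differences(human: str, machine: str) -> List[Tuple[str, str, str]]:
    """Return (operation, human_words, machine_words) for every span
    where the two transcriptions disagree."""
    h, m = human.lower().split(), machine.lower().split()
    diffs = []
    matcher = SequenceMatcher(a=h, b=m, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # 'replace', 'delete', or 'insert'
            diffs.append((tag, " ".join(h[i1:i2]), " ".join(m[j1:j2])))
    return diffs


diffs = transcription_differences("I want to pay my bill",
                                  "I want to play my bill")
# diffs == [("replace", "pay", "play")]
```

Each reported span pairs what the human heard with what the recognizer produced, which is the kind of difference the update of the internal representations could be based on.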
19 Claims
1. A method (independent claim, recited in full under First Claim above).
Dependent Claims (2, 3, 4, 5)
6. A computer-based speech recognition system, comprising:
one or more processors communicatively coupled to a communication network; and
memory storing instructions that, when executed by the one or more processors, cause the computer-based speech recognition system to perform acts comprising:
obtaining a speech input including one or more words or phrases, wherein the speech input is from a call to a call center via the communication network;
determining to provide the speech input to a human;
receiving a response from the human that identifies a first task for the input, the first task including a first transcription of the speech input;
processing the speech input to identify a second task for the speech input, the processing using a set of internal representations of the computer-based speech recognition system, the set of internal representations comprising one or more machine-readable parameters for recognizing speech in a speech utterance, the second task including a second transcription of the speech input;
comparing the speech input and the first transcription of the speech input included in the first task with the second transcription of the speech input included in the second task, the set of internal representations comprising one or more machine-readable parameters for recognizing speech in a speech utterance;
modifying the set of internal representations of the computer-based speech recognition system based at least in part on the comparing the first transcription with the second transcription to create a modified set of internal representations, wherein the modifying includes adjusting at least a portion of the set of internal representations;
checking the performance of the modified set of internal representations to prevent the modification from degrading the set of internal representations, wherein the checking comprises determining that a performance difference between the set of internal representations before and after modification is within a margin of error;
receiving, by the computer-based speech recognition system, another input; and
processing, by the one or more processors and based at least in part on the modified set of internal representations, the other input to identify a third task for the other input.
Dependent Claims (7, 8, 9, 10, 11, 12)
13. One or more non-transitory computer-readable storage media storing instructions that, when executed by one or more processors, communicatively coupled to a communication network, configure the one or more processors to perform acts comprising:
receiving, by a computer-based speech recognition system, a speech input including one or more words or phrases, wherein the speech input is from a call to a call center via the communication network;
determining, by the computer-based speech recognition system, to provide the speech input to a human;
receiving a response from the human that identifies a first task for the speech input, the first task including a first transcription of the speech input;
processing the speech input to identify a second task for the speech input, the processing using a set of internal representations of the computer-based speech recognition system, the set of internal representations comprising one or more machine-readable parameters for recognizing speech in a speech utterance, the second task including a second transcription of the speech input;
comparing the speech input and the first transcription of the speech input included in the first task with the second transcription of the speech input included in the second task, the set of internal representations comprising one or more machine-readable parameters for recognizing speech in a speech utterance;
modifying the set of internal representations of the computer-based speech recognition system based at least in part on the comparing the first transcription with the second transcription to create a modified set of internal representations, wherein the modifying includes adjusting at least a portion of the set of internal representations;
checking the performance of the modified set of internal representations to prevent the modification from degrading the set of internal representations, wherein the checking comprises determining that a performance difference between the set of internal representations before and after modification is within a margin of error;
receiving, by the computer-based speech recognition system, another input; and
processing, by the one or more processors and based at least in part on the modified set of internal representations, the other input to identify a third task for the other input.
Dependent Claims (14, 15, 16, 17, 18, 19)
Specification