Targeted detection of regions in speech processing data streams
Abstract
In speech processing systems, a special audio trigger indication is configured to efficiently isolate and mark incorrect speech processing results. The trigger indication may be configured to be easily recognizable by a speech processing device under various ASR and acoustic conditions. Once a speech processing device recognizes the trigger indication, incorrectly processed speech processing results are marked and may be isolated and prioritized for review by training and upgrading processes.
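
The prioritization described in the abstract can be illustrated with a small sketch. It is not taken from the patent: the `ReviewQueue` class, its method names, and the two-level priority scheme are all hypothetical, and the sketch assumes only that flagged results should reach the training or review process before unflagged ones, in arrival order within each group.

```python
import heapq
import itertools
from typing import Any, List, Tuple


class ReviewQueue:
    """Hypothetical review queue: flagged (marked incorrect) items are served first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, Any]] = []
        # Monotonic counter keeps arrival order among items of equal priority.
        self._counter = itertools.count()

    def add(self, item: Any, flagged: bool) -> None:
        # Assumption: priority 0 for flagged items, 1 for everything else.
        priority = 0 if flagged else 1
        heapq.heappush(self._heap, (priority, next(self._counter), item))

    def pop(self) -> Any:
        # Returns the next item to review; raises IndexError when empty.
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


queue = ReviewQueue()
queue.add("result-a", flagged=False)
queue.add("result-b", flagged=True)
queue.add("result-c", flagged=False)
queue.add("result-d", flagged=True)
order = [queue.pop() for _ in range(len(queue))]
```

In this sketch the two flagged results are reviewed before the unflagged ones, and each group keeps its arrival order.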
31 Citations
23 Claims
1. A method of identifying speech processing results for further processing performed by a speech recognition processing component, the method comprising:
obtaining a model for a trigger word, wherein the trigger word identifies audio data for further processing;
receiving first audio data comprising first speech from a user;
performing speech processing on the first audio data to obtain speech processing results;
returning the speech processing results to the user;
receiving second audio data comprising second speech from the user;
determining that the second audio data comprises the trigger word using the trigger word model;
after the determining, creating an indicator identifying at least one of the first audio data and the speech processing results for further processing by a speech recognition training component; and
sending the indicator, the first audio data and the speech processing results to the speech recognition training component.
Dependent claims: 2, 3, 4
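
The sequence of steps recited in claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the `Session` class, the `TrainingPackage` record, the indicator fields, and the three injected callables (`recognize`, `is_trigger`, `send_to_training`) are hypothetical stand-ins for the speech processing step, the trigger word model, and the link to the training component.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class TrainingPackage:
    """Hypothetical bundle sent to the speech recognition training component."""
    indicator: dict
    audio: bytes    # the first audio data
    results: str    # the speech processing results returned to the user


class Session:
    def __init__(
        self,
        recognize: Callable[[bytes], str],        # stand-in for speech processing
        is_trigger: Callable[[bytes], bool],      # stand-in for the trigger word model
        send_to_training: Callable[[TrainingPackage], None],
    ) -> None:
        self.recognize = recognize
        self.is_trigger = is_trigger
        self.send_to_training = send_to_training
        self._last_audio: Optional[bytes] = None
        self._last_results: Optional[str] = None

    def handle(self, audio: bytes) -> Optional[str]:
        # Second audio data: if it contains the trigger word, flag the
        # previous utterance instead of processing this one as a command.
        if self._last_audio is not None and self.is_trigger(audio):
            package = TrainingPackage(
                indicator={"flag": "further_processing",
                           "targets": ["audio", "results"]},
                audio=self._last_audio,
                results=self._last_results or "",
            )
            self.send_to_training(package)
            return None
        # First audio data: process it and return the results to the user.
        results = self.recognize(audio)
        self._last_audio, self._last_results = audio, results
        return results


# Usage with toy stand-ins: byte strings play the role of audio.
sent: List[TrainingPackage] = []
session = Session(
    recognize=lambda a: a.decode().upper(),
    is_trigger=lambda a: a == b"oops",
    send_to_training=sent.append,
)
first_reply = session.handle(b"play jazz")
second_reply = session.handle(b"oops")
```

Here the trigger check runs before ordinary recognition, which is one possible design choice; the claim itself only requires that the indicator be created after the trigger word is detected.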
5. A method performed by a speech recognition processing component, the method comprising:
transmitting speech processing results, wherein the speech processing results were determined from first audio data;
receiving second audio data indicating that the speech processing results are incorrect;
identifying third audio data as corresponding to incorrect speech processing results, wherein the third audio data includes at least a portion of the first audio data;
after the identifying, creating an indicator identifying the third audio data for further processing by a speech recognition training component; and
sending the third audio data, the indicator and the speech processing results to the speech recognition training component.
Dependent claims: 6, 7, 8, 9, 10, 11
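
Claim 5 requires only that the third audio data include at least a portion of the first audio data; it does not say how that portion is chosen. One way this could look, sketched here purely as an assumption, is a bounded history of recent audio frames from which a range is cut when an "incorrect" signal arrives. The `AudioHistory` class, the `flag_incorrect` function, the frame indexing, and the indicator fields are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class AudioHistory:
    """Hypothetical bounded buffer of recent (index, frame) pairs."""

    def __init__(self, max_frames: int) -> None:
        self._frames: Deque[Tuple[int, bytes]] = deque(maxlen=max_frames)
        self._next_index = 0

    def append(self, frame: bytes) -> int:
        index = self._next_index
        self._frames.append((index, frame))  # oldest frame is dropped when full
        self._next_index += 1
        return index

    def frames_between(self, start: int, end: int) -> List[Tuple[int, bytes]]:
        # Only frames still held in the buffer can be returned.
        return [(i, f) for i, f in self._frames if start <= i < end]


def flag_incorrect(history: AudioHistory, start: int, end: int,
                   results: str) -> Dict[str, object]:
    """Build the payload for the training component from buffered audio."""
    frames = history.frames_between(start, end)
    third_audio = b"".join(frame for _, frame in frames)
    indicator = {
        "status": "incorrect",
        "first_frame": frames[0][0] if frames else None,
        "frame_count": len(frames),
    }
    return {"audio": third_audio, "indicator": indicator, "results": results}


history = AudioHistory(max_frames=3)
for chunk in (b"a", b"b", b"c", b"d"):
    history.append(chunk)
payload = flag_incorrect(history, start=0, end=4, results="abcd")
```

Because the buffer holds three frames, the oldest frame has already been discarded, so the flagged audio is a portion of the original rather than all of it.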
12. A computing device configured to perform speech recognition processing, the computing device comprising:
a processor;
a memory device including instructions operable to be executed by the processor to perform a set of actions, configuring the processor:
to transmit speech processing results, wherein the speech processing results were determined based at least in part on first audio data;
to receive second audio indicating that the speech recognition results are incorrect;
to identify third audio data as corresponding to incorrect speech recognition results;
to, after identifying the third audio data, create an indicator identifying the third audio data for further processing by a speech recognition training component; and
to send the third audio data, the indicator and the speech processing results to the speech recognition training component.
Dependent claims: 13, 14, 15, 16, 17
18. A non-transitory computer-readable storage medium storing processor-executable instructions for controlling a computing device configured to perform speech recognition processing, the storage medium comprising:
program code to transmit speech processing results to a user, wherein the speech processing results were determined based at least in part on first audio data;
program code to receive second audio indicating that the speech recognition results are incorrect;
program code to identify third audio data as corresponding to incorrect speech recognition results;
program code to, after identifying the third audio data, create an indicator identifying the third audio data for further processing by a speech recognition training component; and
program code to send the third audio data, the indicator and the speech processing results to the speech recognition training component.
Dependent claims: 19, 20, 21, 22, 23
Specification