System and method enabling acoustic barge-in
Abstract
A system and method enabling acoustic barge-in during a voice prompt in a communication system. An acoustic prompt model is trained to represent the system prompt using the specific speech signal of the prompt. The acoustic prompt model is utilized in a speech recognizer in parallel with the recognizer's active vocabulary words to suppress the echo of the prompt within the recognizer. The speech recognizer may also use a silence model and traditional garbage models such as noise models and out-of-vocabulary word models to reduce the likelihood that noises and out-of-vocabulary words in the user utterance will be mapped erroneously onto active vocabulary words.
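The parallel arrangement described in the abstract can be pictured as a competition among labels, where only an active vocabulary word is ever reported. The following is a minimal sketch under stated assumptions, not the patented implementation: the function name `recognize`, the labels `<prompt>`, `<silence>`, `<noise>` and `<oov>`, and all score values are hypothetical, and the scores stand in for whatever likelihoods a real recognizer would compute.

```python
from typing import Dict, Optional, Set


def recognize(scores: Dict[str, float], vocabulary: Set[str]) -> Optional[str]:
    """Return the best-scoring label only if it is an active vocabulary word.

    `scores` maps every competing model (vocabulary words, prompt model,
    silence model, garbage models) to a log-likelihood for the same input.
    If a non-vocabulary model wins, nothing is reported.
    """
    best = max(scores, key=scores.get)
    return best if best in vocabulary else None


# Hypothetical active vocabulary and made-up log-likelihoods.
vocabulary = {"call", "stop"}

# Input that contains only the echo of the prompt: the prompt model wins.
echo_only = {"call": -42.0, "stop": -40.5, "<prompt>": -12.3,
             "<silence>": -55.0, "<noise>": -38.0, "<oov>": -37.2}

# Input in which the user says "stop" over the prompt: a vocabulary word wins.
barge_in = {"call": -39.0, "stop": -11.8, "<prompt>": -30.1,
            "<silence>": -60.0, "<noise>": -41.0, "<oov>": -35.5}

print(recognize(echo_only, vocabulary))  # None
print(recognize(barge_in, vocabulary))   # stop
```

Because the prompt, silence and garbage models absorb input they match best, a vocabulary word is reported only when it outscores all of them.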
22 Claims
1. A method of suppressing speech recognition errors in a speech recognition system, said method comprising the steps of:
receiving an input signal that comprises at least one user-generated command word and an echo from an outgoing system voice prompt, wherein at least one word of the outgoing system voice prompt is included in the echo received in the input signal;
generating an acoustic model of the outgoing system voice prompt, said acoustic prompt model mathematically representing the words of the outgoing system voice prompt;
supplying the input signal to a speech recognizer having an acoustic model of a target vocabulary, said acoustic target vocabulary model mathematically representing at least one user-generated command word;
comparing the input signal to the acoustic prompt model and to the acoustic target vocabulary model;
determining which of the acoustic prompt model and the acoustic target vocabulary model provides a best match for the input signal during the comparing step;
accepting the best match if the acoustic target vocabulary model provides the best match; and
ignoring the best match if the acoustic prompt model provides the best match.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
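The steps of claim 1 (generate a prompt model, compare the input to both models, accept or ignore the best match) can be illustrated with a toy example. This is a sketch under strong assumptions: each "acoustic model" is a single one-dimensional Gaussian fitted to a handful of made-up feature values, which is a stand-in chosen for brevity and not the model type used in the patent. The names `GaussianModel` and `recognize` are hypothetical.

```python
import math
from typing import List, Optional


class GaussianModel:
    """Toy acoustic model: one 1-D Gaussian fitted to example feature values."""

    def __init__(self, label: str, frames: List[float]) -> None:
        self.label = label
        self.mean = sum(frames) / len(frames)
        variance = sum((x - self.mean) ** 2 for x in frames) / len(frames)
        self.var = max(variance, 1e-3)  # floor to avoid a degenerate model

    def log_likelihood(self, frames: List[float]) -> float:
        """Sum of per-frame Gaussian log-densities."""
        return sum(
            -0.5 * (math.log(2 * math.pi * self.var)
                    + (x - self.mean) ** 2 / self.var)
            for x in frames
        )


def recognize(input_frames: List[float],
              prompt_model: GaussianModel,
              vocabulary_models: List[GaussianModel]) -> Optional[str]:
    """Compare the input to all models; ignore the result if the prompt wins."""
    candidates = [prompt_model] + vocabulary_models
    best = max(candidates, key=lambda m: m.log_likelihood(input_frames))
    return None if best is prompt_model else best.label


# Generate the prompt model from the prompt's own (made-up) features.
prompt_model = GaussianModel("<prompt>", [1.0, 1.2, 0.9, 1.1])
# Target vocabulary model for one hypothetical command word.
stop_model = GaussianModel("stop", [5.0, 5.2, 4.9, 5.1])

print(recognize([1.05, 0.95, 1.1], prompt_model, [stop_model]))  # None
print(recognize([5.0, 5.1, 4.95], prompt_model, [stop_model]))   # stop
```

A real input signal would mix the echo and the command word in time; the two calls above treat an echo-like segment and a command-like segment separately to keep the comparison step visible.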
10. A method of suppressing speech recognition errors and improving word accuracy in a speech recognition system that enables a user of a communication device to interrupt an outgoing system voice prompt with user-generated command words that halt the outgoing voice prompt and initiate desired actions, said method comprising the steps of:
generating an acoustic model of the outgoing system voice prompt, said acoustic prompt model mathematically representing the words of the outgoing system voice prompt;
storing the acoustic prompt model in a speech recognizer;
storing an acoustic target vocabulary model in the speech recognizer, said acoustic target vocabulary model including models of a plurality of user-generated command words;
supplying an input signal to a comparer in the speech recognizer, said input signal including at least one user-generated command word and an echo from the outgoing system voice prompt, wherein at least one word of the outgoing system voice prompt is included in the echo in the input signal;
comparing the input signal to the acoustic target vocabulary model and the acoustic prompt model to identify which model provides a best match for the input signal;
ignoring the best match if the acoustic prompt model provides the best match;
accepting the best match if the acoustic target vocabulary model provides the best match;
supplying to an action table, any command word corresponding to the best match provided by the acoustic target vocabulary model;
identifying from the action table, an action corresponding to the supplied command word;
halting the outgoing system voice prompt; and
initiating the identified action.
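The last four steps of claim 10 (supply the command word to an action table, identify the action, halt the prompt, initiate the action) amount to a lookup followed by two side effects. The sketch below is illustrative only: `PromptPlayer`, `handle_result`, the table contents and the behavior for a word missing from the table are all assumptions, not details taken from the claim.

```python
from typing import Callable, Dict, List, Optional


class PromptPlayer:
    """Hypothetical stand-in for the component playing the outgoing prompt."""

    def __init__(self) -> None:
        self.playing = True

    def halt(self) -> None:
        self.playing = False


def handle_result(command: Optional[str],
                  action_table: Dict[str, Callable[[], None]],
                  player: PromptPlayer) -> bool:
    """Dispatch an accepted command word; return True if an action ran.

    `command` is None when the prompt model gave the best match, in which
    case the result is ignored and the prompt keeps playing.
    """
    if command is None:
        return False
    action = action_table.get(command)
    if action is None:
        return False  # assumption: words without a table entry do nothing
    player.halt()     # halt the outgoing system voice prompt
    action()          # initiate the identified action
    return True


# Hypothetical action table that records what was triggered.
log: List[str] = []
action_table = {
    "call": lambda: log.append("dialing"),
    "stop": lambda: log.append("stopped"),
}

player = PromptPlayer()
handle_result(None, action_table, player)    # prompt echo: ignored
handle_result("call", action_table, player)  # accepted command word
print(player.playing, log)  # False ['dialing']
```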
11. A speech recognizer for recognizing input command words while suppressing speech recognition errors, said speech recognizer comprising:
means for receiving an input signal that comprises incoming user input speech and an echo from an outgoing system voice prompt, wherein at least one word of the outgoing system voice prompt is included in the echo received in the input signal;
an acoustic vocabulary model that mathematically represents at least one command word;
an acoustic prompt model that mathematically represents the words of the outgoing system voice prompt; and
a comparer that receives the input signal and compares the input signal to the acoustic vocabulary model and to the acoustic prompt model to determine which model provides a best match for the input signal, said comparer accepting the best match if the acoustic target vocabulary model provides the best match, and ignoring the best match if the acoustic prompt model provides the best match.
Dependent claims: 12, 13, 14, 15, 16, 17, 18
19. A speech recognition system for suppressing speech recognition errors and improving word accuracy, said system enabling a user of a communication device to interrupt an outgoing system voice prompt with user-generated command words that halt the outgoing system voice prompt and initiate desired actions, said system comprising:
means for generating an acoustic model of the outgoing system voice prompt, said acoustic prompt model mathematically representing the words of the outgoing system voice prompt;
an acoustic vocabulary model comprising mathematical models of a plurality of user-generated command words;
a comparer for receiving an input signal and comparing the input signal to the acoustic vocabulary model and to the acoustic prompt model to determine which model provides a best match for the input signal, said input signal including at least one user-generated command word and an echo from the outgoing system voice prompt, wherein at least one word of the outgoing system voice prompt is included in the echo in the input signal, said comparer accepting the best match if the acoustic target vocabulary model provides the best match, and ignoring the best match if the acoustic prompt model provides the best match; and
an action table that receives a command word from the comparer upon a determination by the comparer that the acoustic target vocabulary model provides the best match, said action table associating the received command word with a corresponding action, and notifying an associated network to initiate the corresponding action, and to halt the outgoing system voice prompt.
Dependent claims: 20, 21, 22
Specification