Method and system for recognizing speech using wildcards in an expected response
Abstract
A speech recognition system used in a workflow receives and analyzes speech input to recognize and accept a user's response to a task. Under certain conditions, a user's response might be expected. In these situations, the expected response may modify the behavior of the speech recognition system to improve recognition accuracy. For example, if the hypothesis of a user's response matches the expected response then there is a high probability that the user's response was recognized correctly. An expected response may include expected words and wildcard words. Wildcard words represent any recognized word in a user's response. By including wildcard words in the expected response, the speech recognition system may make modifications based on a wide range of user responses.
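The matching idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a wildcard is written as `"*"`, that each wildcard stands for exactly one recognized word, and that the hypothesis and expected response must have the same number of words. The names `WILDCARD` and `matches_expected` are hypothetical and do not come from the patent.

```python
WILDCARD = "*"  # hypothetical marker for a wildcard word (an assumption of this sketch)


def matches_expected(hypothesis, expected):
    """Return True if the hypothesis matches the expected response word by word.

    Assumption: one wildcard position matches any single recognized word,
    and both sequences must have the same length.
    """
    if len(hypothesis) != len(expected):
        return False
    return all(e == WILDCARD or h == e for h, e in zip(hypothesis, expected))


expected = ["pick", WILDCARD, "items"]
print(matches_expected(["pick", "three", "items"], expected))  # True
print(matches_expected(["skip", "three", "items"], expected))  # False
```

Because the wildcard position accepts any word, a single expected response such as "pick * items" covers many concrete user responses.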
20 Claims
1. A method for recognizing speech in a speech recognition system, the method comprising the steps of:
sensing, using a microphone, speech input and converting the sensed speech input to an electrical signal;
converting, using a signal processor comprising an analog-to-digital converter, the electrical signal to digital data;
receiving, using a computing device, the digital data, the computing device having at least one processor and a memory;
processing the digital data, using the processor, to produce acoustic features and acoustic data;
processing the acoustic features and the acoustic data using the processor and a library of models corresponding to hypothesis words and stored on the memory, to derive a hypothesis, the hypothesis comprising a sequence of hypothesis words;
assigning each hypothesis word a confidence score;
retrieving from the memory an expected response comprising a sequence of at least one expected word and at least one wildcard word;
comparing the hypothesis word-by-word to the expected response;
adjusting an acceptance threshold for each hypothesis word based on the results of the comparison;
comparing the confidence score assigned to a hypothesis word to its adjusted acceptance threshold and accepting or rejecting the hypothesis word based on the results of the comparison; and
if the hypothesis word is accepted, updating acoustic features and acoustic data of a model, in the library of models, corresponding to the hypothesis word using the acoustic features and acoustic data corresponding to the hypothesis word.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
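The comparison, threshold-adjustment, and acceptance steps of claim 1 can be sketched as follows. The claim does not state how much a threshold moves or in which direction, so the values here are assumptions: a word that matches an expected word has its threshold lowered by a fixed amount, and a word at a wildcard position keeps the base threshold. `WordResult`, `score_hypothesis`, `base_threshold`, `match_delta`, and `wildcard_delta` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

WILDCARD = "*"  # hypothetical marker for a wildcard word


@dataclass
class WordResult:
    word: str
    confidence: float
    threshold: float
    accepted: bool


def score_hypothesis(words, confidences, expected,
                     base_threshold=0.5, match_delta=-0.25, wildcard_delta=0.0):
    """Adjust each word's acceptance threshold, then accept or reject it.

    Assumptions of this sketch: thresholds are only adjusted when the
    hypothesis and the expected response have the same length, and a word
    is accepted when its confidence exceeds its adjusted threshold.
    """
    same_length = len(words) == len(expected)
    results = []
    for i, (word, conf) in enumerate(zip(words, confidences)):
        threshold = base_threshold
        if same_length:
            if expected[i] == WILDCARD:
                threshold += wildcard_delta   # wildcard position
            elif expected[i] == word:
                threshold += match_delta      # matched an expected word
        results.append(WordResult(word, conf, threshold, conf > threshold))
    return results


results = score_hypothesis(
    ["pick", "three", "items"], [0.3, 0.3, 0.6], ["pick", WILDCARD, "items"]
)
for r in results:
    print(r.word, r.threshold, r.accepted)
```

In this example "pick" is accepted with a confidence of 0.3 because its threshold was lowered to 0.25, while "three" at the wildcard position is rejected at the same confidence because its threshold stayed at 0.5.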
9. A method for recognizing speech in a speech recognition system, the method comprising the steps of:
sensing, using a microphone, speech input and converting the sensed speech input to an electrical signal;
converting, using a signal processor comprising an analog-to-digital converter, the electrical signal to digital data;
receiving, using a computing device, the digital data, the computing device having at least one processor and a memory;
processing the digital data, using the processor, to produce acoustic features and acoustic data;
deriving a hypothesis, using the processor running speech recognition algorithms, the acoustic features, the acoustic data, and a library of models corresponding to hypothesis words stored in the memory, the hypothesis comprising a sequence of hypothesis words;
retrieving from the memory an expected response comprising a sequence of at least one expected word and at least one wildcard word;
comparing, in sequence, each hypothesis word in the hypothesis to its corresponding expected word or wildcard word in the expected response;
if the hypothesis word matches the corresponding expected word, then marking the hypothesis word as suitable for use in adaptation;
adapting acoustic features and acoustic data of the models corresponding to hypothesis words marked suitable for adaptation using the acoustic features and the acoustic data corresponding to those hypothesis words;
deriving a hypothesis word, using the processor running speech recognition algorithms and an adapted model of the adapted models stored in the memory; and
comparing a confidence score assigned to the hypothesis word derived from the adapted model to an acceptance threshold and accepting or rejecting the hypothesis word based on the results of the comparison.
Dependent claims: 10, 11
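The marking step of claim 9 can be sketched as below. The claim marks a hypothesis word as suitable for adaptation when it matches the corresponding expected word; this sketch assumes that words falling on wildcard positions are therefore left unmarked, and that nothing is marked when the two sequences differ in length. `mark_for_adaptation` and the `"*"` wildcard marker are hypothetical.

```python
WILDCARD = "*"  # hypothetical marker for a wildcard word


def mark_for_adaptation(hypothesis, expected):
    """Return one flag per hypothesis word: True if suitable for adaptation.

    Assumption: only a match against an explicit expected word marks a word;
    a word at a wildcard position is not marked.
    """
    if len(hypothesis) != len(expected):
        return [False] * len(hypothesis)
    return [e != WILDCARD and h == e for h, e in zip(hypothesis, expected)]


flags = mark_for_adaptation(["pick", "three", "items"], ["pick", WILDCARD, "items"])
print(flags)  # [True, False, True]
```

The intuition is that a match against a known expected word gives independent evidence that the word was recognized correctly, whereas a wildcard position provides no such check.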
12. A system for recognizing speech, comprising:
a microphone configured to sense speech input and convert the sensed speech input to an electrical signal;
a signal processor configured to convert the electrical signal to digital data, the signal processor comprising an analog-to-digital converter;
a computing device comprising a processor and a memory configured to execute (i) a recognition algorithm, (ii) a threshold-adjustment algorithm, and (iii) an acceptance algorithm, wherein;
the recognition algorithm processes the digital data to produce acoustic features and acoustic data and assesses the acoustic features and the acoustic data using a library of models corresponding to hypothesis words stored in the memory to generate (i) a hypothesis comprising hypothesis words and (ii) a confidence score associated with one or more hypothesis words;
the threshold-adjustment algorithm adjusts an acceptance threshold corresponding to a hypothesis word if the hypothesis matches an expected response stored in the memory, wherein the expected response comprises at least one expected word and at least one wildcard word; and
the acceptance algorithm accepts a hypothesis word and updates acoustic features and acoustic data of a model, in the library of models, corresponding to the hypothesis word when the hypothesis word's confidence score exceeds the hypothesis word's acceptance threshold.
Dependent claims: 13, 14, 15, 16
17. A system for recognizing speech, comprising:
a microphone configured to sense speech input and convert the sensed speech input to an electrical signal;
a signal processor configured to convert the electrical signal to digital data, the signal processor comprising an analog-to-digital converter;
a computing device comprising a processor and a memory configured to execute (i) a recognition algorithm, (ii) a model-update algorithm, and (iii) an acceptance algorithm, wherein;
the recognition algorithm processes the digital data to produce acoustic features and acoustic data and assesses the acoustic features and the acoustic data using a library of models corresponding to hypothesis words stored in the memory to generate a hypothesis comprising hypothesis words;
the model-update algorithm (i) compares the sequence of words of the hypothesis to an expected response stored in the memory, the expected response comprising expected words and at least one wildcard word, (ii) marks each hypothesis word that matches a corresponding expected word in the expected response as suitable for adaptation, and (iii) adapts acoustic features and acoustic data of a model for a hypothesis word marked suitable for adaptation using the acoustic features and the acoustic data corresponding to that hypothesis word; and
the acceptance algorithm accepts a hypothesis word when the hypothesis word's confidence score exceeds the hypothesis word's acceptance threshold.
Dependent claims: 18, 19, 20
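The adaptation step of the model-update algorithm in claim 17 might look roughly like the sketch below. The claims do not specify the model form or the adaptation rule, so this example substitutes a deliberately simple stand-in: each word model is a running mean of feature vectors, updated incrementally with the features of a word that was marked suitable for adaptation. `adapt_model`, the `(mean, count)` representation, and the feature values are all assumptions made for illustration.

```python
def adapt_model(model_mean, count, features):
    """Incrementally update a running-mean model with one new feature vector.

    Stand-in rule (an assumption, not the patent's method):
        new_mean = mean + (x - mean) / (count + 1)
    """
    new_count = count + 1
    new_mean = [m + (x - m) / new_count for m, x in zip(model_mean, features)]
    return new_mean, new_count


# Hypothetical library of models: word -> (mean feature vector, samples seen)
library = {"pick": ([1.0, 2.0], 1)}

# Features observed for a hypothesis word that was marked suitable for adaptation
mean, count = library["pick"]
library["pick"] = adapt_model(mean, count, [3.0, 4.0])
print(library["pick"])  # ([2.0, 3.0], 2)
```

Only words that passed the marking step would reach `adapt_model`, which keeps unverified recognitions from drifting the stored models.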
Specification