Speech recognition using ambiguous or phone key spelling and/or filtering
First Claim
1. A method of performing large vocabulary speech recognition comprising:
- receiving a filtering sequence of one or more key-press signals each of which indicates which of a plurality of keys has been selected by a user, where each of the keys represents two or more letters;
receiving an acoustic representation of a key-disambiguating utterance made in association with a given key press signal in said filtering sequence;
performing speech recognition upon the acoustic representation of the key-disambiguation utterance that favors recognition of letter identifying words identifying letters represented by the given key press signal;
responding to a recognition of the given key press signal's associated key-disambiguation utterance as a letter identifying word by causing the set of letters represented by the given key press signal in the filtering sequence to be substantially limited to a letter identified by the recognized letter identifying word;
receiving an acoustic representation of a word utterance that represents one or more words;
performing speech recognition upon the acoustic word utterance representation which scores word candidates as a function of the match between the acoustic representation and acoustic models of words;
wherein the scoring of said word candidates favors word candidates containing a sequence of one or more alphabetic characters corresponding to the filtering sequence of key-press signals, where a candidate word is considered to contain a character sequence corresponding to the filtering sequence if each sequential character in the character sequence corresponds to one of the letters represented by its corresponding sequential key-press signal;
wherein said method further includes;
responding to a key press signal by displaying in user-perceivable form a set of one or more letter identifying words starting with each letter represented by the key press signal's associated pressed key;
favoring the recognition of an utterance made after the display of the pressed key's associated letter identifying words as corresponding to one of said displayed words; and
responding to recognition of one of said displayed words by said causing the set of letters represented by the key press signal in the filtering sequence to be substantially limited to the letter associated with the recognized displayed word.
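The matching condition in the "wherein" clause, and the narrowing of a key's letter set after a letter identifying word is recognized, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `KEY_LETTERS` layout, the function names, the reading of "containing a sequence" as a match on the word's leading characters, and the fixed score `bonus` are all hypothetical.

```python
# Hypothetical keypad layout: the claim only requires two or more letters per key.
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def matches_filter(word, filter_sets):
    """Return True if each leading character of `word` is one of the
    letters allowed at the same position of the filtering sequence."""
    if len(word) < len(filter_sets):
        return False
    return all(ch in allowed for ch, allowed in zip(word.lower(), filter_sets))

def favor_matching(scored, filter_sets, bonus=1.0):
    """Add `bonus` (an assumed constant) to every candidate that matches
    the filter, then return the candidates best-first."""
    adjusted = [
        (word, score + bonus if matches_filter(word, filter_sets) else score)
        for word, score in scored
    ]
    return sorted(adjusted, key=lambda ws: ws[1], reverse=True)

# Key presses 2, 2, 8 give three ambiguous letter sets.
filter_sets = [KEY_LETTERS[k] for k in "228"]
# A recognized letter identifying word for "c" limits the first key's
# letter set to that single letter.
filter_sets[0] = "c"

scored = [("bat", 0.6), ("cat", 0.5), ("hat", 0.7)]  # made-up acoustic scores
ranked = favor_matching(scored, filter_sets)
```

Adding a bonus rather than discarding non-matching words mirrors the claim's wording that scoring "favors" matching candidates; a hard filter would be a stricter variant.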
Abstract
Alphabetic filtering of the speech recognition of words uses a key press to indicate a desired character in an alphabetic filter string, where each key press represents two or more letters. The key presses can be disambiguated by recognizing a key-disambiguation utterance in association with a given key press. A user can select a desired recognition candidate from a choice list produced by such filtered word recognition. Ambiguous alphabetic filtering can be performed iteratively in response to the addition of successive ambiguous key presses. A user can select to re-recognize the utterance using filtering based on ambiguous key input after seeing the results of recognition without such filtering. Unambiguous alphabetic filtering can be performed by using multiple presses of an ambiguous key to disambiguate which letter is intended. A user can select between entering text by either large vocabulary speech recognition or by spelling text by pressing phone keys.
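The abstract's unambiguous variant, in which repeated presses of one key pick out a single letter, might look like the sketch below. The abstract does not spell out the convention, so the wrap-around rule, the `KEY_LETTERS` layout, and both function names are assumptions made for illustration.

```python
# Hypothetical keypad layout, assumed for this example.
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def multitap_letter(key, presses):
    """Map `presses` consecutive presses of `key` to one letter,
    wrapping around after the key's last letter (an assumed convention)."""
    if presses < 1:
        raise ValueError("presses must be at least 1")
    letters = KEY_LETTERS[key]
    return letters[(presses - 1) % len(letters)]

def multitap_filter(taps):
    """Turn (key, presses) pairs into an unambiguous filter string."""
    return "".join(multitap_letter(key, presses) for key, presses in taps)

prefix = multitap_filter([("2", 3), ("2", 1), ("8", 1)])  # "cat"
```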
22 Claims
2. A method of performing large vocabulary speech recognition comprising:
- receiving a filtering sequence of one or more key-press signals each of which indicates which of a plurality of keys has been selected by a user, where each of the keys represents two or more letters;
receiving an acoustic representation of a key-disambiguating utterance made in association with a given key press signal in said filtering sequence;
performing speech recognition upon the acoustic representation of the key-disambiguation utterance that favors recognition of letter identifying words identifying letters represented by the given key press signal;
responding to a recognition of the given key press signal's associated key-disambiguation utterance as a letter identifying word by causing the set of letters represented by the given key press signal in the filtering sequence to be substantially limited to a letter identified by the recognized letter identifying word;
receiving an acoustic representation of a word utterance that represents one or more words;
performing speech recognition upon the acoustic word utterance representation which scores word candidates as a function of the match between the acoustic representation and acoustic models of words;
wherein;
the scoring of said word candidates favors word candidates containing a sequence of one or more alphabetic characters corresponding to the filtering sequence of key-press signals, where a candidate word is considered to contain a character sequence corresponding to the filtering sequence if each sequential character in the character sequence corresponds to one of the letters represented by its corresponding sequential key-press signal; and
each key press signal of the filtering sequence has a time period associated with it that starts after the previous key press signal in the sequence, if any, and ends with or before the subsequent key press signal in the sequence, if any; and
a received key-disambiguating utterance is associated with a given key press signal if it is received in the utterance duration associated with that key press.
(Dependent claims: 3, 4)
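The time-period rule that ties a key-disambiguating utterance to a key press can be sketched as a lookup over key-press timestamps. The claim only bounds each period (it starts after the previous press and ends with or before the next one); the sketch below assumes, for simplicity, that each period runs from its own key press up to the next press. The function name and the timestamps are hypothetical.

```python
from bisect import bisect_right

def associate_utterance(press_times, utterance_time):
    """Return the index of the key press whose time period contains
    `utterance_time`, or None if the utterance precedes every press.

    `press_times` must be sorted in ascending order. Assumption: each
    period runs from its own key press up to (not including) the next."""
    index = bisect_right(press_times, utterance_time) - 1
    return index if index >= 0 else None

press_times = [0.0, 1.5, 3.2]  # seconds, made-up values
owner = associate_utterance(press_times, 2.0)  # falls in the second press's period
```

A real system would probably also cap the period after the last key press; the claim language leaves that open.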
5. A computerized method of performing speech recognition comprising:
- receiving a filtering sequence of one or more key-press signals each of which indicates which of a plurality of keys has been selected by a user, where each of the keys represents two or more letters;
receiving an acoustic representation of a key-disambiguating utterance made in association with a given key press signal in said filtering sequence;
performing speech recognition upon the acoustic representation of the key-disambiguation utterance that favors recognition of letter identifying words identifying letters represented by the given key press signal;
responding to a recognition of the given key press signal's associated key-disambiguation utterance as a letter identifying word by causing the set of letters represented by the given key press signal in the filtering sequence to be substantially limited to a letter identified by the recognized letter identifying word;
receiving an acoustic representation of a word utterance that represents one or more words;
performing speech recognition upon the acoustic word utterance representation which scores word candidates as a function of the match between the acoustic representation and acoustic models of words;
wherein;
the scoring of said word candidates favors word candidates containing a sequence of one or more alphabetic characters corresponding to the filtering sequence of key-press signals, where a candidate word is considered to contain a character sequence corresponding to the filtering sequence if each sequential character in the character sequence corresponds to one of the letters represented by its corresponding sequential key-press signal; and
each key press signal of the filtering sequence has a time period associated with it that starts after the previous key press signal in the sequence, if any, and ends with or before the subsequent key press signal in the sequence, if any; and
a received key-disambiguating utterance is associated with a given key press signal if it is received in the utterance duration associated with that key press; and
wherein said method further includes;
outputting a plurality of the word candidates produced by said speech recognition in a user-perceivable form in a choice list; and
responding to a user selection of one of the output word candidates by selecting it as the one or more recognized word for the recognition.
(Dependent claims: 6, 7)
8. A computerized method of performing speech recognition comprising:
- receiving a filtering sequence of one or more key-press signals each of which indicates which of a plurality of keys has been selected by a user, where each of the keys represents two or more letters;
receiving an acoustic representation of a key-disambiguating utterance made in association with a given key press signal in said filtering sequence;
performing speech recognition upon the acoustic representation of the key-disambiguation utterance that favors recognition of letter identifying words identifying letters represented by the given key press signal;
responding to a recognition of the given key press signal's associated key-disambiguation utterance as a letter identifying word by causing the set of letters represented by the given key press signal in the filtering sequence to be substantially limited to a letter identified by the recognized letter identifying word;
receiving an acoustic representation of a word utterance that represents one or more words;
performing speech recognition upon the acoustic word utterance representation which scores word candidates as a function of the match between the acoustic representation and acoustic models of words;
wherein;
the scoring of said word candidates favors word candidates containing a sequence of one or more alphabetic characters corresponding to the filtering sequence of key-press signals, where a candidate word is considered to contain a character sequence corresponding to the filtering sequence if each sequential character in the character sequence corresponds to one of the letters represented by its corresponding sequential key-press signal; and
each key press signal of the filtering sequence has a time period associated with it that starts after the previous key press signal in the sequence, if any, and ends with or before the subsequent key press signal in the sequence, if any; and
a received key-disambiguating utterance is associated with a given key press signal if it is received in the utterance duration associated with that key press; and
said performing of speech recognition that favors candidates containing a sequence of characters corresponding to the filter sequence is performed repeatedly for a given acoustic word utterance representation in response to the receipt of successive key-press signals in said filtering sequence.
(Dependent claims: 9, 10)
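The repeated, per-key-press recognition in the last element of this claim can be sketched as a generator that yields a fresh candidate list each time the filtering sequence grows. Two simplifications are assumed here and are not taken from the claim: the acoustic scores for the utterance are computed once and reused rather than re-running recognition, and non-matching words are dropped outright instead of merely being disfavored. The keypad subset and all names are hypothetical.

```python
# Hypothetical keypad subset, assumed for this example.
KEY_LETTERS = {"3": "def", "5": "jkl", "6": "mno"}

def refilter_on_each_key(scored_candidates, keys):
    """Yield the surviving candidates, best-first, after each successive
    key press is added to the filtering sequence."""
    filter_sets = []
    for key in keys:
        filter_sets.append(KEY_LETTERS[key])
        surviving = [
            (word, score)
            for word, score in scored_candidates
            if len(word) >= len(filter_sets)
            and all(ch in allowed for ch, allowed in zip(word, filter_sets))
        ]
        yield sorted(surviving, key=lambda ws: ws[1], reverse=True)

# Made-up acoustic scores for one word utterance.
scored = [("den", 0.6), ("dog", 0.9), ("log", 0.7)]
steps = list(refilter_on_each_key(scored, ["3", "6"]))
```

After the first key press (`3`) both "dog" and "den" survive; the second key press (`6`) leaves only "dog".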
11. A computerized method of performing speech recognition comprising:
- receiving an acoustic representation of a word utterance that represents one or more words;
performing speech recognition upon the acoustic word utterance representation that scores word candidates as a function of the match between the acoustic representation and acoustic models of words;
providing a user perceivable output indicating the one or more words of the word candidate selected by the speech recognition as most probably corresponding to said word utterance representation;
providing a user interface that enables a user to select to respond to such an output by entering a filtering sequence to filter recognition of said utterance representation, which filtering sequence includes one or more key-press signals each of which indicates which of a plurality of keys has been selected by a user, where each of the key-press signals represents two or more letters;
receiving a filtering sequence of one or more key-press signals each of which indicates which of a plurality of keys has been selected by a user, where each of the keys represents two or more letters;
receiving an acoustic representation of a key-disambiguating utterance made in association with a given key press signal in said filtering sequence;
performing speech recognition upon the acoustic representation of the key-disambiguation utterance that favors recognition of letter identifying words identifying letters represented by the given key press signal;
responding to a recognition of the given key press signal's associated key-disambiguation utterance as a letter identifying word by causing the set of letters represented by the given key press signal in the filtering sequence to be substantially limited to a letter identified by the recognized letter identifying word;
re-performing speech recognition upon the acoustic word utterance representation which scores word candidates as a function of the match between the acoustic representation and acoustic models of words;
wherein;
the scoring of said word candidates favors word candidates containing a sequence of one or more alphabetic characters corresponding to the filtering sequence of key-press signals, where a candidate word is considered to contain a character sequence corresponding to the filtering sequence if each sequential character in the character sequence corresponds to one of the letters represented by its corresponding sequential key-press signal; and
each key press signal of the filtering sequence has a time period associated with it that starts after the previous key press signal in the sequence, if any, and ends with or before the subsequent key press signal in the sequence, if any; and
a received key-disambiguating utterance is associated with a given key press signal if it is received in the utterance duration associated with that key press; and
providing a user perceivable output indicating the one or more words of the word candidate selected by the re-performing of speech recognition as most probably corresponding to said word utterance.
(Dependent claims: 12, 13)
14. A computerized method of inputting a sequence of one or more alphabetic characters into a computing system comprising performing the following for each character in the sequence:
- receiving a given key-press signal indicating which of a plurality of keys has been selected by a user, where;
the given key press signal is ambiguous in that it indicates that one of two or more letters associated with it has been selected; and
the key press signal has a time period associated with it that starts after the previous key press signal in the input sequence, if any, and ends with or before the subsequent key press signal in the input sequence, if any;
responding to receipt of an acoustic representation of an utterance during the time period associated with the given key press signal by;
associating the utterance with the key press signal; and
performing speech recognition upon the acoustic representation to select a best scoring word for the utterance, with the recognition favoring recognition of a letter identifying word that identifies one of the letters associated with the given key press signal; and
responding to the selection of a letter identifying word as the best scoring word by treating the letter identified by said best scoring word as the letter input by the user in association with the associated key press signal.
(Dependent claims: 15, 16, 17, 18, 19, 20, 21)
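The per-character step of this claim, in which recognition is biased toward letter identifying words for the pressed key, can be sketched as below. The claim does not name the letter identifying words or the form of the bias, so the `LETTER_WORDS` vocabulary, the multiplicative `boost`, the positive score scale, and the function name are all assumptions.

```python
# Hypothetical letter identifying words and keypad subset.
LETTER_WORDS = {
    "alpha": "a", "bravo": "b", "charlie": "c",
    "delta": "d", "echo": "e", "foxtrot": "f",
}
KEY_LETTERS = {"2": "abc", "3": "def"}

def disambiguate(key, hypotheses, boost=2.0):
    """Pick the best scoring word for an utterance tied to `key`.

    `hypotheses` maps each candidate word to a positive acoustic score
    (higher is better). Letter identifying words for letters on `key`
    have their score multiplied by `boost`. Returns the identified
    letter, or None if the winner is not such a word."""
    def adjusted(word):
        score = hypotheses[word]
        letter = LETTER_WORDS.get(word)
        if letter is not None and letter in KEY_LETTERS[key]:
            score *= boost
        return score

    best = max(hypotheses, key=adjusted)
    letter = LETTER_WORDS.get(best)
    if letter is not None and letter in KEY_LETTERS[key]:
        return letter
    return None

# "delta" scores higher acoustically, but "charlie" names a letter on key 2.
letter = disambiguate("2", {"charlie": 0.4, "delta": 0.5})
```

Returning `None` when an ordinary word wins leaves the key press ambiguous, which matches the claim's condition that the letter is fixed only when a letter identifying word is the best scoring word.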
22. A computerized method of performing alphabetic input using user selectable keys:
- receiving a sequence of one or more key-press signals, each of which indicates which of a plurality of keys has been selected by a user for its position in the sequence, where each of the keys represents a plurality of letters;
responding to a given key-press signal by displaying in user-perceivable form a separate letter identifying word for each of said plurality of letters represented by the given key-press's key;
receiving an acoustic representation of a key-disambiguating utterance made in association with the given key-press signal;
performing speech recognition upon the acoustic representation of the key-disambiguation utterance, which recognition favors recognition of one of said letter identifying words displayed in association with the given key-press signal;
responding to a recognition of the given key-press signal's associated key-disambiguation utterance as a given letter identifying word by increasing the probability the letter output in association with the given key-press signal will be that associated with the recognized letter identifying word; and
outputting a sequence of one or more alphabetic characters corresponding to the sequence of key-press signals, in which each character in the sequence corresponds to one of the set of letters represented by a corresponding key-press signal in said sequence of key-press signals, as affected by changes in probability caused by said recognition of said key-disambiguation utterance.
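Claim 22 speaks of increasing a letter's probability rather than fixing the letter outright. One way to picture that is a per-key distribution that starts uniform and is reweighted when a letter identifying word is recognized. The uniform prior, the `weight` factor, the keypad subset, and both function names are assumptions for illustration; the claim does not specify how the probability is represented or updated.

```python
# Hypothetical keypad subset, assumed for this example.
KEY_LETTERS = {"2": "abc", "3": "def", "8": "tuv"}

def letter_distribution(letters, recognized=None, weight=4.0):
    """Return a probability for each letter on a key. All letters start
    equal; a recognized letter gets `weight` times the mass of the others."""
    raw = {ch: (weight if ch == recognized else 1.0) for ch in letters}
    total = sum(raw.values())
    return {ch: w / total for ch, w in raw.items()}

def output_letters(presses):
    """`presses` is a list of (key, recognized_letter_or_None) pairs.
    Output the most probable letter for each key press; with no
    recognition the tie goes to the key's first letter."""
    out = []
    for key, recognized in presses:
        dist = letter_distribution(KEY_LETTERS[key], recognized)
        out.append(max(dist, key=dist.get))
    return "".join(out)

dist = letter_distribution("abc", recognized="b")
text = output_letters([("2", "c"), ("2", None), ("8", "t")])  # "cat"
```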
Specification