Using a spoken utterance for disambiguation of spelling inputs into a speech recognition system

US 20060229870A1
Filed: 03/30/2005
Published: 10/12/2006
Est. Priority Date: 03/30/2005
Status: Active Grant

Abstract

A method of verifying a speech input can include determining pronunciation data for a received user spoken utterance specifying a word and speech recognizing further user spoken utterances specifying individual characters of the word. An N-best list can be generated for each character. Word candidates can be generated using the N-best list for each character. The pronunciation data can be compared with the word candidates to determine at least one match.
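The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes pronunciation data can be represented as a phoneme string, and the names `generate_word_candidates`, `match_candidates`, and `TOY_LEXICON` are hypothetical stand-ins for a real recognizer's acoustic comparison.

```python
from itertools import product


def generate_word_candidates(nbest_lists):
    """Combine one character from each per-character N-best list into words."""
    return ["".join(chars) for chars in product(*nbest_lists)]


def match_candidates(pronunciation, candidates, pronounce):
    """Keep the candidates whose pronunciation equals the spoken word's data."""
    return [word for word in candidates if pronounce(word) == pronunciation]


# Hypothetical toy lexicon (assumption): a real system would compare acoustic
# or phonetic data rather than look up strings in a dictionary.
TOY_LEXICON = {"bat": "B AE T", "pat": "P AE T", "bad": "B AE D", "pad": "P AE D"}

# One N-best list per spelled character; "b"/"p" and "t"/"d" are confusable.
nbest = [["b", "p"], ["a"], ["t", "d"]]

candidates = generate_word_candidates(nbest)
matches = match_candidates("P AE D", candidates, TOY_LEXICON.get)
```

Here the spoken word resolves the ambiguity left by the spelled letters: four candidates are generated from the N-best lists, and only the one matching the pronunciation data survives.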

20 Claims

1. A method of verifying a speech input comprising:
- determining pronunciation data for a received user spoken utterance specifying a word;
- speech recognizing further user spoken utterances specifying individual characters of the word, wherein an N-best list is generated for each character;
- automatically generating word candidates using the N-best list for each character; and
- comparing the pronunciation data with the word candidates to determine at least one match.

Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The method of claim 1, further comprising dynamically generating a grammar of the word candidates, such that the pronunciation data is compared with the grammar to determine a match.
  - 3. The method of claim 1, said speech recognizing step further comprising determining at least one alternative character for each N-best list.
  - 4. The method of claim 1, wherein the pronunciation data comprises acoustic data corresponding to the user spoken utterance.
  - 5. The method of claim 1, said automatically generating step comprising creating word candidates based upon the N-best lists in accordance with a dictionary of allowable words.
  - 6. The method of claim 1, said automatically generating step comprising creating word candidates using the N-best lists without restriction from a dictionary of allowable words.
  - 7. The method of claim 1, further comprising:
    - first determining a domain of words; and
    - comparing the pronunciation data with a set of common words of the domain to find a match.
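Claims 2, 5, and 6 can be illustrated with a short sketch that builds candidates either restricted to a dictionary of allowable words or unrestricted, then wraps them in a dynamically generated grammar. The claims do not name a grammar format; JSGF is used here purely as an assumption, and `candidates_from_nbest` and `build_grammar` are hypothetical helper names.

```python
from itertools import product


def candidates_from_nbest(nbest_lists, dictionary=None):
    """Build word candidates from per-character N-best lists.

    With dictionary=None every combination is kept (unrestricted, as in
    claim 6); otherwise only allowable words are kept (as in claim 5).
    """
    words = ("".join(chars) for chars in product(*nbest_lists))
    if dictionary is None:
        return list(words)
    return [word for word in words if word in dictionary]


def build_grammar(candidates, rule_name="word"):
    """Render the candidates as a one-rule JSGF grammar (format is an assumption)."""
    if not candidates:
        raise ValueError("cannot build a grammar from zero candidates")
    alternatives = " | ".join(candidates)
    return (
        "#JSGF V1.0;\n"
        "grammar candidates;\n"
        f"public <{rule_name}> = {alternatives};"
    )


nbest = [["c", "k"], ["a"], ["t", "d"]]
unrestricted = candidates_from_nbest(nbest)
restricted = candidates_from_nbest(nbest, dictionary={"cat", "cad", "dog"})
grammar = build_grammar(restricted)
```

The dictionary-restricted variant keeps the grammar small, while the unrestricted variant can still reach names or other words that no dictionary lists.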

8. A method of processing a speech input comprising:
- selecting a domain of words;
- determining pronunciation data for a word specified by a received user spoken utterance;
- comparing the pronunciation data for the word with a list of common words of the domain to find a match;
- if a match is found, discontinuing further speech processing; and
- if a match is not found, speech recognizing further user spoken utterances specifying a plurality of individual characters of the word for comparison to the pronunciation data.

Dependent claims (9, 10, 11, 12, 13):
  - 9. The method of claim 8, said speech recognizing step further comprising:
    - determining an N-best list for each of the plurality of characters;
    - automatically generating word candidates using the N-best lists; and
    - comparing the pronunciation data with the word candidates to determine at least one match.
  - 10. The method of claim 9, further comprising including the word candidates in a grammar, such that the pronunciation data is compared with the grammar to determine a match.
  - 11. The method of claim 9, said step of determining an N-best list comprising identifying at least one alternative character for each of the plurality of characters.
  - 12. The method of claim 9, said step of automatically generating word candidates comprising creating word candidates based upon the N-best lists in accordance with a dictionary of allowable words.
  - 13. The method of claim 9, said step of automatically generating word candidates comprising creating word candidates using the N-best lists without restriction from a dictionary of allowable words.
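The control flow of claims 8 and 9 can be sketched as a two-stage lookup: try the domain's common words first, and only ask for spelled characters when that fails. This is an illustrative sketch under stated assumptions: pronunciation data is modeled as a phoneme string, and `process_speech_input`, `toy_pronounce`, and `toy_recognize_spelling` are hypothetical names standing in for real recognizer components.

```python
from itertools import product


def process_speech_input(pronunciation, common_words, pronounce, recognize_spelling):
    """Return (matched_word_or_None, spelling_was_needed)."""
    # Stage 1: compare against the common words of the selected domain.
    for word in common_words:
        if pronounce(word) == pronunciation:
            return word, False  # match found: discontinue further processing

    # Stage 2: no match, so recognize the spelled characters, which yields
    # one N-best list per character, and test each generated candidate.
    nbest_lists = recognize_spelling()
    for chars in product(*nbest_lists):
        candidate = "".join(chars)
        if pronounce(candidate) == pronunciation:
            return candidate, True
    return None, True


# Hypothetical toy data (assumption) for a surname domain.
TOY_LEXICON = {"smith": "S M IH TH", "jones": "JH OW N Z", "nowak": "N OW V AA K"}
spelling_calls = []


def toy_pronounce(word):
    return TOY_LEXICON.get(word)


def toy_recognize_spelling():
    spelling_calls.append(1)  # record that the user was asked to spell
    return [["n", "m"], ["o"], ["w"], ["a"], ["k"]]


common = ["smith", "jones"]
first = process_speech_input("S M IH TH", common, toy_pronounce, toy_recognize_spelling)
calls_after_first = len(spelling_calls)
second = process_speech_input("N OW V AA K", common, toy_pronounce, toy_recognize_spelling)
```

In this sketch a common word is resolved without any spelling prompt, and the spelling step runs only for words outside the common list.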

14. A machine readable storage, having stored thereon a computer program having a plurality of code sections executable by a machine for causing the machine to perform the steps of:
- determining pronunciation data for a received user spoken utterance specifying a word;
- speech recognizing further user spoken utterances specifying individual characters of the word, wherein an N-best list is generated for each character;
- automatically generating word candidates using the N-best list for each character; and
- comparing the pronunciation data with the word candidates to determine at least one match.

Dependent claims (15, 16, 17, 18, 19, 20):
  - 15. The machine readable storage of claim 14, further comprising dynamically generating a grammar of the word candidates, such that the pronunciation data is compared with the grammar to determine a match.
  - 16. The machine readable storage of claim 14, said speech recognizing step further comprising determining at least one alternative character for each N-best list.
  - 17. The machine readable storage of claim 14, wherein the pronunciation data comprises acoustic data corresponding to the user spoken utterance.
  - 18. The machine readable storage of claim 14, said automatically generating step comprising creating word candidates based upon the N-best lists in accordance with a dictionary of allowable words.
  - 19. The machine readable storage of claim 14, said automatically generating step comprising creating word candidates using the N-best lists without restriction from a dictionary of allowable words.
  - 20. The machine readable storage of claim 14, further comprising:
    - first determining a domain of words; and
    - comparing the pronunciation data for the word with a set of common words of the domain to find a match.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: International Business Machines Corporation
Inventors: Kobal, Jeffrey S.
Granted Patent: US 7,529,678 B2
US Class Current: 704/252
CPC Class Codes: G10L 15/22 Procedures used during a sp...
