METHOD, SYSTEM AND COMPUTER PROGRAM FOR ENHANCED SPEECH RECOGNITION OF DIGITS INPUT STRINGS

US 20090125306A1
Filed: 09/19/2008
Published: 05/14/2009
Est. Priority Date: 09/19/2007
Status: Active Grant

Abstract

The present invention proposes a method, system and computer program for speech recognition. According to one embodiment, a method is provided wherein, for an expected input string divided into a plurality of expected string segments, a speech segment is received for each expected string segment. Speech recognition is then performed separately on each said speech segment via the generation, for each said speech segment, of a segment n-best list comprising n highest confidence score results. A global n-best list is then generated corresponding to the expected input string utilizing the segment n-best lists and a final global speech recognition result corresponding to said expected input string is determined via the pruning of the results of the global n-best list utilizing a pruning criterion.
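The flow described in the abstract can be sketched roughly as follows. This is a minimal Python illustration, not the patented implementation: the `(text, confidence)` tuple layout, the function names `build_global_nbest` and `prune`, the Cartesian-product combination, and the additive score are all assumptions made for the example, and the pruning criterion shown is a hypothetical stand-in.

```python
from itertools import product
from typing import Callable, List, Tuple

Hypothesis = Tuple[str, float]  # (recognized text, confidence score)


def build_global_nbest(segment_lists: List[List[Hypothesis]], n: int) -> List[Hypothesis]:
    """Combine one hypothesis per segment into whole-string candidates."""
    candidates = []
    for combo in product(*segment_lists):
        text = "".join(h[0] for h in combo)
        score = sum(h[1] for h in combo)  # assumption: scores combine additively
        candidates.append((text, score))
    # Keep the n candidates with the highest combined score.
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[:n]


def prune(global_nbest: List[Hypothesis], criterion: Callable[[str], bool]) -> List[Hypothesis]:
    """Drop candidates whose text fails the pruning criterion."""
    return [c for c in global_nbest if criterion(c[0])]


# Hypothetical segment n-best lists for a four-digit string spoken in two segments.
segments = [
    [("12", 0.9), ("72", 0.6)],
    [("34", 0.8), ("54", 0.7)],
]
global_list = build_global_nbest(segments, n=4)
# Stand-in criterion for illustration only: the string must start with "7".
survivors = prune(global_list, lambda s: s.startswith("7"))
final = survivors[0] if survivors else None
```

Here the highest-scoring candidate overall is discarded by the criterion, so the final result is the best-scoring candidate among those that survive pruning.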

14 Claims

1. A method for speech recognition comprising:
- for an expected input string divided into a plurality of expected string segments, receiving a speech segment for each expected string segment;
  
  performing speech recognition separately on each said speech segment, wherein said performing speech recognition comprises generating, for each said speech segment, a segment n-best list comprising n highest confidence score results of said speech recognition, where n is an integer;
  
  generating a global n-best list corresponding to said expected input string utilizing said segment n-best lists; and
  
  determining a final global speech recognition result corresponding to said expected input string, wherein said determining said final global speech recognition result comprises pruning results of said global n-best list utilizing a pruning criterion.
  - 2. The method according to claim 1, wherein receiving a speech segment for each expected string segment further comprises:
    - receiving a first speech segment corresponding to a first expected string segment from a speaker; and
      
      prompting the speaker to speak a second speech segment corresponding to a second expected string segment.
  - 3. The method according to claim 1, further comprising:
    - receiving a single continuous speech segment input corresponding to said expected input string from a speaker;
      
      determining one or more time positions within said single continuous speech segment input utilizing a signal received from said speaker to indicate one or more input speech segments; and
      
      dividing said single continuous speech segment input into a plurality of input speech segments utilizing said one or more time positions.
  - 4. The method according to claim 1, wherein performing speech recognition comprises:
    - performing a grammar analysis speech recognition utilizing a determined maximum length of the expected input string.
  - 5. The method according to claim 1, wherein performing speech recognition comprises:
    - performing a grammar analysis speech recognition utilizing a determined exact length of the expected input string.
  - 6. The method according to claim 2, further comprising:
    - receiving data representing a length of at least one expected string segment of said plurality of expected string segments from said speaker; and
      
      restricting, utilizing said data, a grammar analysis of a speech recognition on said at least one expected string segment.
  - 7. The method according to claim 1, further comprising:
    - determining a weight for each result in said global n-best list, wherein said determining said weight comprises calculating each said weight utilizing a plurality of weights associated with corresponding segment n-best lists results composing said each result in said global n-best list.
  - 8. The method according to claim 7, wherein calculating each said weight comprises:
    - summing said plurality of weights associated with said corresponding segment n-best lists results.
  - 9. The method according to claim 1, further comprising:
    - prompting a speaker to repeat a speech segment in response to a determination that speech recognition performed on said speech segment fails to meet a predetermined accuracy threshold.
  - 10. The method according to claim 1, further comprising:
    - determining an accuracy level of said final global speech recognition result utilizing user input.
  - 11. The method according to claim 1, wherein the expected input string corresponds to a credit card number.
  - 12. The method according to claim 11, wherein the pruning criterion comprises the Luhn algorithm.
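
Claims 7, 8, 11 and 12 together suggest a concrete case: each global candidate carries a weight summed from its segment results, and candidates that fail the Luhn check are pruned. The following is a minimal Python sketch under those assumptions. The function names, the data layout, and the sample confidences are hypothetical, and `luhn_valid` is the standard public Luhn checksum rather than code taken from the patent.

```python
from itertools import product


def luhn_valid(number: str) -> bool:
    """Standard Luhn checksum: double every second digit from the right."""
    if not number.isdigit():
        return False
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def best_card_number(segment_lists):
    """Return the highest-weight candidate that passes the Luhn check, or None."""
    candidates = []
    for combo in product(*segment_lists):
        text = "".join(t for t, _ in combo)
        weight = sum(w for _, w in combo)  # claim 8: sum the segment weights
        candidates.append((text, weight))
    valid = [c for c in candidates if luhn_valid(c[0])]  # claim 12: Luhn pruning
    return max(valid, key=lambda c: c[1], default=None)


# Hypothetical n-best lists for a 16-digit number spoken in four segments.
segment_lists = [
    [("4111", 0.9)],
    [("1111", 0.9), ("1191", 0.4)],
    [("1111", 0.8)],
    [("1119", 0.6), ("1111", 0.5)],  # top hypothesis mishears the last digit
]
best = best_card_number(segment_lists)
```

In this example the candidate built from the top hypothesis of every segment ends in 9 and fails the checksum, so the lower-weight candidate ending in 1 is selected instead.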

13. One or more machine-readable media having stored therein a program product, which when executed by a set of one or more processors causes the set of one or more processors to perform a method comprising:
- for an expected input string divided into a plurality of expected string segments, receiving a speech segment for each expected string segment;
  
  performing speech recognition separately on each said speech segment, wherein said performing speech recognition comprises generating, for each said speech segment, a segment n-best list comprising n highest confidence score results of said speech recognition, where n is an integer;
  
  generating a global n-best list corresponding to said expected input string utilizing said segment n-best lists; and
  
  determining a final global speech recognition result corresponding to said expected input string, wherein said determining said final global speech recognition result comprises pruning results of said global n-best list utilizing a pruning criterion.

14. A system for speech recognition comprising:
- a set of one or more processors;
  
  a memory unit coupled with the set of one or more processors; and
  
  a speech recognition unit operable to, for an expected input string divided into a plurality of expected string segments, receive a speech segment for each expected string segment;
  
  perform speech recognition separately on each said speech segment, wherein said speech recognition comprises generating, for each said speech segment, a segment n-best list comprising n highest confidence score results of said speech recognition, where n is an integer;
  
  generate a global n-best list corresponding to said expected input string utilizing said segment n-best lists; and
  
  determine a final global speech recognition result corresponding to said expected input string, wherein determining said final global speech recognition result comprises pruning results of said global n-best list utilizing a pruning criterion.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: International Business Machines Corporation
Inventors: Crepy, Hubert; Lejeune, Remi

Granted Patent: US 8,589,162 B2
US Class Current: 704/236
CPC Class Codes: G10L 15/08 Speech classification or se...
