Method and apparatus for automatic recognition of long sequences of spoken digits

US 20030023439A1
Filed: 05/02/2001
Published: 01/30/2003
Est. Priority Date: 05/02/2001
Status: Abandoned Application

Abstract

A method and system of recognizing speech based in part on an observation that a speaker naturally pauses and speaks smaller subgroups of speech units or digits that form part of a complete longer speech sequence. In the method, subgroups of speech units are processed by the system between a human's natural pauses. This pause is detected by the system and the subgroup is processed in order to provide a recognition result, which is a best representation of the input subgroup. The recognition result is immediately repeated back to the user for verification. The user is prompted to repeat a subgroup for re-recognition and re-verification if a rejection criteria is met; otherwise the processing steps are repeated for remaining subgroups until it has been determined that the complete speech sequence has been accurately recognized.
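
The loop described in the abstract can be sketched as follows. This is a minimal illustration, not the applicants' implementation: the function name `recognize_sequence`, its callback parameters, and the use of an expected digit count to decide that the sequence is complete are all assumptions made for the example.

```python
from typing import Callable, List


def recognize_sequence(
    next_subgroup: Callable[[], str],
    recognize: Callable[[str], str],
    feed_back: Callable[[str], None],
    user_rejects: Callable[[str], bool],
    expected_length: int,
) -> str:
    """Recognize a long sequence one pause-delimited subgroup at a time.

    All callbacks are hypothetical stand-ins:
      next_subgroup  returns the input captured up to the next natural pause
      recognize      maps that input to a recognition result
      feed_back      repeats the result to the user (prompt or TTS)
      user_rejects   reports whether the rejection criterion was met
    """
    accepted: List[str] = []
    # Assumption: the sequence is complete once enough units are accepted.
    while sum(len(group) for group in accepted) < expected_length:
        utterance = next_subgroup()
        result = recognize(utterance)
        feed_back(result)              # immediate feedback for verification
        if user_rejects(result):
            continue                   # the user repeats the same subgroup
        accepted.append(result)
    return "".join(accepted)
```

In a simulated call, a user who hears "123" fed back instead of "124" rejects it and repeats the subgroup, and only the corrected subgroup is kept in the final sequence.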

42 Citations

27 Claims

1. A method of recognizing speech in systems that accept speech input, comprising:
- (a) receiving at least a current subgroup of speech units that form part of a complete speech sequence that is to be input from a user;
  
  (b) detecting a natural pause between input subgroups;
  
  (c) recognizing the speech units of the subgroup to provide a recognition result; and
  
  (d) immediately feeding back the recognition result for verification by the user,
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
  - 2. The method of claim 1, wherein said user is only prompted to repeat said subgroup for re-recognition and re-verification if a rejection criteria is met.
  - 3. The method of claim 1, further comprising:
    - (e) repeating steps (a) to (d) for remaining input subgroups until it is determined that the complete speech sequence has been recognized.
  - 4. The method of claim 1, wherein step (d) is effected using pre-recorded prompts or via text-to-speech (TTS) synthesis to feed back the recognition result.
  - 5. The method of claim 2, wherein said rejection criteria is embodied as a negative utterance spoken by the user after receiving the fed back recognition result.
  - 6. The method of claim 2, wherein said rejection criteria is embodied as a negative utterance spoken by the user concurrent with inputting the subgroup that is recognized in step (c).
  - 7. The method of claim 2, wherein if said rejection criteria are met repeatedly, the user is prompted to speak the subgroups in smaller groups of speech units.
  - 8. The method of claim 7, wherein said prompt to speak subgroups in smaller groups of speech units provides a built in training mechanism for the user.
  - 9. The method of claim 2, wherein if said rejection criteria are met repeatedly, the user is prompted to use a dial pad to enter the speech units.
  - 10. The method of claim 1, wherein said speech units are selected from any of spoken digits, spoken letters and spoken words.
  - 11. The method of claim 1, wherein input of a next subgroup after receiving the fed back recognition result indicates a correct recognition of the currently input subgroup.
  - 12. The method of claim 2, wherein said rejection criteria requires determining a level of confidence in said recognition result.
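
The rejection handling in claims 2, 5 to 7, 9 and 12 could be sketched as below. The claims do not give numeric thresholds or say how the criteria combine, so the names `RecognitionResult`, `is_rejected` and `next_prompt`, the confidence cutoff, and the rejection counts that trigger each fallback are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RecognitionResult:
    text: str
    confidence: float                 # hypothetical score in [0, 1]
    negative_utterance: bool = False  # e.g. the user said "no"


def is_rejected(result: RecognitionResult, min_confidence: float = 0.5) -> bool:
    """Assumed rejection criterion: a negative utterance or low confidence."""
    return result.negative_utterance or result.confidence < min_confidence


def next_prompt(
    consecutive_rejections: int,
    smaller_after: int = 2,   # assumed threshold, not from the claims
    dial_pad_after: int = 4,  # assumed threshold, not from the claims
) -> str:
    """Choose the next prompt from the number of rejections in a row."""
    if consecutive_rejections >= dial_pad_after:
        return "use_dial_pad"
    if consecutive_rejections >= smaller_after:
        return "speak_smaller_groups"
    if consecutive_rejections > 0:
        return "repeat_subgroup"
    return "continue"
```

Ordering the checks from the strongest fallback down keeps the escalation monotonic: a user who keeps being rejected is first asked to repeat, then to speak smaller groups, then to switch to the dial pad.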

13. An automatic speech recognition system, comprising:
- a receiver for receiving at least a current subgroup of speech units that form part of a complete speech sequence that is to be input by a user;
  
  a detector for detecting a natural pause after receiving the subgroup;
  
  a decoder for detecting a natural pause between input subgroups to output a recognition result representative of the current subgroup; and
  
  a controller for evaluating the output recognition result and feeding back the recognition result to the user.
- Dependent claims (14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27)
  - 14. The system of claim 13, wherein said user is only prompted to repeat said subgroup for re-recognition and re-verification if a rejection criteria is met.
  - 15. The system of claim 13, wherein the decoder compares the input subgroup with stored recognition grammar in order to determine the recognition result.
  - 16. The system of claim 18, wherein the recognition grammar is stored in a remote memory accessible by the decoder.
  - 17. The system of claim 14, wherein the recognition result includes at least one of a subgroup of speech units and a negative utterance representation that is included in the recognition result, and wherein the rejection criteria is met if the negative utterance is included therein.
  - 18. The system of claim 14, wherein said rejection criteria is met if the user speaks a negative utterance after receiving the fed back recognition result.
  - 19. The system of claim 14, wherein said rejection criteria is met if the user speaks a negative utterance while inputting the current subgroup, so that said recognition result includes the negative utterance.
  - 20. The system of claim 14, wherein the system remains active to process subsequent subgroups until it is determined that the complete speech sequence has been recognized.
  - 21. The system of claim 13, wherein said controller accesses pre-recorded prompts or a text-to-speech synthesis processor in order to effect feedback of the recognition result to the user.
  - 22. The system of claim 14, wherein if said rejection criteria is met repeatedly, said controller prompts the user to speak the subgroups in smaller groups of speech units.
  - 23. The system of claim 22, wherein said prompt to speak subgroups in smaller groups of speech units provides a built in training mechanism for the user.
  - 24. The system of claim 14, wherein if said rejection criteria is met repeatedly, said prompt generator prompts the user to use a dial pad to enter digits corresponding to the speech units.
  - 25. The system of claim 13, wherein said speech units are selected from any of spoken digits, spoken letters and spoken words.
  - 26. The system of claim 13, wherein input of a next subgroup after receiving the fed back recognition result indicates a correct recognition of the currently input subgroup.
  - 27. The system of claim 13, wherein said decoder determines a confidence level for said recognition result.
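
The claims recite a detector for natural pauses but do not say how a pause is detected. One common approach, used here purely as an assumed stand-in, is to treat a run of low-energy frames as a pause. The function name `split_on_pauses`, the energy threshold, and the minimum pause length are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_on_pauses(
    energies: Sequence[float],
    silence_threshold: float = 0.1,  # assumed: below this a frame is silent
    min_pause_frames: int = 3,       # assumed: this many silent frames = pause
) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges of subgroups, end exclusive."""
    segments: List[Tuple[int, int]] = []
    start = None      # index of the first speech frame of the open subgroup
    silent_run = 0    # silent frames seen since the last speech frame
    for i, energy in enumerate(energies):
        if energy >= silence_threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_pause_frames:
                # Pause found: close the subgroup at its last speech frame.
                segments.append((start, i - silent_run + 1))
                start = None
                silent_run = 0
    if start is not None:
        # Input ended before a full pause; drop any trailing silent frames.
        segments.append((start, len(energies) - silent_run))
    return segments
```

Short gaps below `min_pause_frames` stay inside a subgroup, so brief silences between digits do not split it, while a longer silence closes the subgroup and hands it to the decoder.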

Current Assignee
Lucent Technologies, Inc. (Nokia Corporation)
Original Assignee
Lucent Technologies, Inc. (Nokia Corporation)
Inventors
Ciurpita, Gregory, Ragavan, Prabhu, Gupta, Sunil K.

Application Number
US09/846,200
Publication Number
US 20030023439A1
US Class Current
704/246
CPC Class Codes
G10L 15/22 Procedures used during a sp...
G10L 2015/221 Announcement of recognition...
