Apparatus and methods for speech recognition including individual or speaker class dependent decoding history caches for fast word acceptance or rejection

US 5,937,383 A
Filed: 06/04/1997
Issued: 08/10/1999
Est. Priority Date: 02/02/1996
Status: Expired due to Fees

Abstract

A method and an apparatus are provided for performing speech recognition on speech segments frequently input by a user. The method and the apparatus include use of keyword scoring in connection with a speech recognition vocabulary, a temporary score, and a predetermined margin to determine an appropriate output as being representative of the input speech segment.

25 Claims

1. A method for performing speech recognition on speech segments frequently input by a user, the method comprising the steps of:
- (a) inputting at least one keyword spoken by the user;
  
  (b) decoding the at least one keyword by scoring the at least one keyword against a speech recognition vocabulary to generate a decoded keyword and at least one score for the decoded keyword;
  
  (c) storing the decoded keyword and the at least one score;
  
  (d) inputting a speech segment spoken by the user;
  
  (e) comparing the input speech segment to the decoded keyword in order to generate a temporary score; and
  
  (f) comparing the temporary score against the at least one stored score and if the temporary score is one of within a predetermined margin of, equivalent to, and larger than the at least one stored score, then the decoded keyword is output as being representative of the input speech segment, else the input speech segment is scored against the speech recognition vocabulary to generate a second decoded keyword and at least one score for the second decoded keyword.
- Dependent claims:
  - 2. The method of claim 1, further comprising the step of storing the second decoded keyword and the at least one score associated therewith.
  - 3. The method of claim 1, further comprising the step of storing the decoded keyword and scores associated therewith in accordance with a predetermined identity of the user.
  - 4. The method of claim 3, further comprising the step of identifying the user via text-independent speaker identification.
  - 5. The method of claim 3, further comprising the step of identifying the user via speaker-independent speaker classification.
  - 6. The method of claim 1, wherein the at least one keyword is a name.
  - 7. The method of claim 6, wherein said method is utilized in a name-based voice dialing phone system.
  - 8. The method of claim 1, wherein the at least one keyword is a command.
  - 9. The method of claim 8, wherein said method is utilized in a command-based voice controlled system.
  - 10. The method of claim 1, wherein the at least one keyword is from a large vocabulary associated with a speech recognition system.
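
The flow of steps (a) through (f), together with the storing step of claim 2, can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: utterances are modeled as plain strings, `similarity` is a toy stand-in for an acoustic score, and `VOCABULARY`, `full_decode`, `DecodingHistoryCache`, and the margin value are all hypothetical names and choices. It also assumes that a higher score is better, so "within a predetermined margin of, equivalent to, and larger than" collapses to a single comparison.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Tuple

# Hypothetical toy vocabulary; "utterances" are plain strings in this sketch.
VOCABULARY = ["call alice", "call bob", "hang up"]


def similarity(a: str, b: str) -> float:
    """Toy stand-in for an acoustic score in [0, 1]; higher is better (assumption)."""
    return SequenceMatcher(None, a, b).ratio()


def full_decode(segment: str) -> Tuple[str, float]:
    """Score the segment against the whole vocabulary (the expensive path)."""
    best = max(VOCABULARY, key=lambda word: similarity(segment, word))
    return best, similarity(segment, best)


@dataclass
class CacheEntry:
    keyword: str
    score: float


class DecodingHistoryCache:
    def __init__(self, margin: float) -> None:
        self.margin = margin
        self.entries: List[CacheEntry] = []

    def enroll(self, utterance: str) -> str:
        # Steps (a)-(c): decode against the vocabulary, then store keyword and score.
        keyword, score = full_decode(utterance)
        self.entries.append(CacheEntry(keyword, score))
        return keyword

    def recognize(self, segment: str) -> Tuple[str, bool]:
        # Steps (d)-(e): compare the new segment only to cached keywords.
        for entry in self.entries:
            temporary = similarity(segment, entry.keyword)
            # Step (f): accept if within the margin of, equal to, or above the stored score.
            if temporary >= entry.score - self.margin:
                return entry.keyword, True
        # Otherwise fall back to the full vocabulary and cache the result (claim 2).
        keyword, score = full_decode(segment)
        self.entries.append(CacheEntry(keyword, score))
        return keyword, False
```

With a margin of 0.15, enrolling `"call alice"` and then calling `recognize("call alise")` is accepted from the cache, while `recognize("hang up")` misses the cache, is decoded against the full vocabulary, and is cached for the next request.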

11. Apparatus for performing speech recognition on speech segments frequently input by a user, the apparatus comprising:
- means for inputting at least one keyword spoken by the user;
  
  means for decoding the at least one keyword by scoring the at least one keyword against a speech recognition vocabulary to generate a decoded keyword and at least one score for the decoded keyword;
  
  means for storing the decoded keyword and the at least one score;
  
  means for inputting a speech segment spoken by the user;
  
  means for comparing the input speech segment to the decoded keyword in order to generate a temporary score; and
  
  means for comparing the temporary score against the at least one stored score and if the temporary score is one of within a predetermined margin of, equivalent to, and larger than the at least one stored score, then the decoded keyword is output as being representative of the input speech segment, else the input speech segment is scored against the speech recognition vocabulary to generate a second decoded keyword and at least one score for the second decoded keyword.
- Dependent claims:
  - 12. The apparatus of claim 11, further comprising means for storing the second keyword and the at least one score associated therewith.
  - 13. The apparatus of claim 11, further comprising means for storing the decoded keyword and scores associated therewith in accordance with a predetermined identity of the user.
  - 14. The apparatus of claim 13, further comprising means for identifying the user via text-independent speaker identification.
  - 15. The method of claim 13, further comprising means for identifying the user via speaker-independent speaker classification.
  - 16. The apparatus of claim 11, wherein the at least one keyword is a name.
  - 17. The apparatus of claim 16, wherein said apparatus is utilized in a name-based voice dialing phone system.
  - 18. The apparatus of claim 11, wherein the at least one keyword is a command.
  - 19. The apparatus of claim 18, wherein said apparatus is utilized in a command-based voice controlled system.
  - 20. The apparatus of claim 11, wherein the at least one keyword is from a large vocabulary associated with a speech recognition system.

21. A system for recognizing keywords frequently input by a speaker, the system comprising:
- a speech recognition engine for decoding at least one keyword uttered by the speaker by scoring the at least one keyword against a speech recognition vocabulary to generate a decoded keyword and at least one score for the decoded keyword;
  
  a cache database for storing the decoded keyword and the at least one score associated therewith in accordance with a predetermined identity of the speaker;
  
  means for performing a Viterbi alignment process on an input speech segment uttered by the speaker wherein the input speech segment is compared to the decoded keyword to generate a temporary score; and
  
  a comparator for comparing the temporary score against the at least one stored score and if the temporary score is one of within a predetermined margin of, equivalent to, and larger than the at least one stored score, then the decoded keyword is output as being representative of the input speech segment, else the input speech segment is scored against the speech recognition vocabulary to generate a second decoded keyword and at least one score for the second decoded keyword.
- Dependent claims:
  - 22. The system of claim 21, wherein the second keyword and the at least one score associated therewith are stored in the cache database.
  - 23. The system of claim 21, wherein the identity of the speaker is determined via text-independent speaker identification.
  - 24. The system of claim 21, wherein the identity of the speaker is determined via speaker-independent speaker classification.
  - 25. The system of claim 21, wherein the at least one keyword is one of a name, a command, and at least one word from a large vocabulary associated with a speech recognition system.
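
Claim 21 combines a speaker-keyed cache database, a Viterbi alignment that produces the temporary score, and a comparator. The sketch below shows one way those three pieces could fit together, under stated assumptions: the alignment uses a simple left-to-right state sequence with no transition costs, scores are log-likelihoods (higher is better), and `viterbi_alignment_score`, `SpeakerKeyedCache`, and `accepts` are hypothetical names that do not come from the patent.

```python
import math
from typing import Dict, Sequence


def viterbi_alignment_score(frame_loglik: Sequence[Sequence[float]]) -> float:
    """Best log score for aligning T frames to S left-to-right states.

    frame_loglik[t][s] is the log-likelihood of frame t under state s of the
    cached keyword's model. The path starts in state 0, ends in state S-1, and
    at each frame either stays or advances by one state. Transition costs are
    ignored (an assumption made to keep the sketch short).
    """
    num_states = len(frame_loglik[0])
    prev = [-math.inf] * num_states
    prev[0] = frame_loglik[0][0]
    for frame in frame_loglik[1:]:
        cur = [-math.inf] * num_states
        for s in range(num_states):
            best = prev[s]                      # stay in the same state
            if s > 0 and prev[s - 1] > best:    # or advance from the previous state
                best = prev[s - 1]
            if best > -math.inf:
                cur[s] = best + frame[s]
        prev = cur
    return prev[num_states - 1]


class SpeakerKeyedCache:
    """Cache database keyed by speaker identity (or speaker class)."""

    def __init__(self) -> None:
        self._db: Dict[str, Dict[str, float]] = {}

    def store(self, speaker: str, keyword: str, score: float) -> None:
        self._db.setdefault(speaker, {})[keyword] = score

    def candidates(self, speaker: str) -> Dict[str, float]:
        # Return a copy so callers cannot mutate the cache by accident.
        return dict(self._db.get(speaker, {}))


def accepts(temporary: float, stored: float, margin: float) -> bool:
    """Comparator: within the margin of, equal to, or larger than the stored score."""
    return temporary >= stored - margin
```

For example, if `"call home"` is stored for one speaker with a score of -4.5, a new segment whose alignment score is -4.0 passes the comparator with a margin of 0.5, whereas a different speaker has no cached candidates and would go straight to full decoding.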

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Maes, Stephane Herman; Ittycheriah, Abraham Poovakunnel
Primary Examiner(s): Dorvil, Richemond
Application Number: US08/869,025
Time in Patent Office: 797 Days
Field of Search: 704/255, 704/275, 704/270, 704/239, 704/240, 704/241, 704/242, 704/236
US Class Current: 704/255

CPC Class Codes

G10L 15/08   Speech classification or se...

G10L 2015/088   Word spotting

G10L 21/0272   Voice signal separating

G10L 25/12   the extracted parameters be...
