System and method using N-best strategy for real time recognition of continuously spelled names

US 5,677,990 A
Filed: 05/05/1995
Issued: 10/14/1997
Est. Priority Date: 05/05/1995
Status: Expired due to Term

Abstract

A multipass recognition strategy selects the N-best hypotheses resulting from each pass and propagates these N-best to the next pass. This strategy outperforms conventional hidden Markov model recognizers using a grammar constraining all possible names. Real time recognition of continuously spelled names is made feasible, in part, because the processor-intensive costly constraints are applied, if at all, in the 4th pass, after the system has produced a much smaller dynamic grammar.

12 Claims

1. A method for recognizing continuously spelled names input as a sequence of letters uttered into a microphonic transducer comprising:
- providing a predetermined letter grammar, defining a plurality of groups of letters;
  
  processing said sequence of letters through a speech recognizer using said letter grammar to produce a first list comprising a plurality of groups of letters representing a set of N-best letter sequence hypotheses, where N is an integer greater than one;
  
  providing a name dictionary comprising a first plurality of names representing possible choices of said continuously spelled names;
  
  performing alignment between said first list and said name dictionary and selecting a second plurality of names from said name dictionary that represents the N-best name candidates;
  
  building a dynamic grammar using said second plurality of names selected in said alignment step;
  
  processing said sequence of letters through a speech recognizer using said dynamic grammar to select one name from said second plurality of names as representing a best hypothesis for the continuously spelled name.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
  - 2. The method of claim 1 wherein said letters of said letter grammar are represented by a sequence of states and wherein said step of providing a predetermined letter grammar further including the step of:
    - tying states of at least a portion of said letters of said letter grammar.
  - 3. The method of claim 1 wherein said step of processing said sequence of letters through a speech recognizer using said letter grammar further includes the step of:
    - storing groups of letters representing possible letter sequences as a plurality of paths; and
      
      applying an adaptive path pruning threshold to decrease the number of paths needed to produce said first list.
  - 4. The method of claim 1 wherein said step of processing said sequence of letters through a speech recognizer using said dynamic grammar further includes the step of:
    - storing groups of letters representing possible letter sequences as a plurality of paths; and
      
      applying an adaptive path pruning threshold to decrease the number of paths needed to select one from said second plurality of names representing the best hypothesis for the continuously spelled name.
  - 5. The method of claim 1 wherein said step of processing said sequence of letters through a speech recognizer using said letter grammar further includes the step of:
    - using said speech recognizer to generate probability scores for each of said group of letters;
      
      storing the highest probability score;
      
      performing a local word pruning to eliminate groups of letters whose probability score is lower than said highest probability score.
  - 6. The method of claim 1 wherein said step of processing said sequence of letters through a speech recognizer using said dynamic grammar further includes the step of:
    - using said speech recognizer to generate probability scores for each of said group of letters;
      
      storing the highest probability score;
      
      performing a local word pruning to eliminate groups of letters whose probability score is lower than said highest probability score.
  - 7. The method of claim 1 wherein said step of processing said sequence of letters through a speech recognizer using said letter grammar further includes the step of:
    - performing a hidden Markov model process with beam search to process said sequence of letters through a speech recognizer.
  - 8. The method of claim 1 wherein said step of processing said sequence of letters through a speech recognizer using said dynamic grammar further includes the step of:
    - performing a hidden Markov model process with beam search to process said sequence of letters through a speech recognizer.
  - 9. The method of claim 1 wherein said step of performing alignment includes the step of:
    - performing a dynamic time warping process to compare said first list to said name dictionary.
  - 10. The method of claim 1 further comprising the step of:
    - processing said sequence of letters through a neural network discrimination process to produce a second list comprising a plurality of groups of letters representing a second set of N-best hypotheses, where N is greater than one.
  - 11. The method of claim 10 wherein said neural network uses two frames to perform the discrimination.
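
Claims 3 through 8 refer to pruning the recognizer's search: a beam search, an adaptive path pruning threshold, and local pruning against the highest probability score. The claims do not give a formula, so the following is only a minimal sketch under stated assumptions: paths are `(letters, log_score)` pairs, the beam is a fixed log-score margin below the best path, and the "adaptive" rule (halving the beam until few enough paths survive) is a hypothetical choice made for illustration. The names `prune_paths`, `adaptive_prune`, `beam_width`, `max_paths`, and `shrink` are not from the patent.

```python
def prune_paths(paths, beam_width):
    """Keep only paths whose log score lies within beam_width of the best path."""
    if not paths:
        return []
    best = max(score for _, score in paths)
    return [(letters, score) for letters, score in paths
            if score >= best - beam_width]


def adaptive_prune(paths, beam_width, max_paths, shrink=0.5):
    """Tighten the beam until at most max_paths survive.

    The halving rule is an assumption; the patent only states that the
    pruning threshold is adaptive.
    """
    survivors = prune_paths(paths, beam_width)
    while len(survivors) > max_paths and beam_width > 1e-6:
        beam_width *= shrink
        survivors = prune_paths(paths, beam_width)
    return survivors, beam_width


# Illustrative letter-sequence paths with made-up log scores.
paths = [
    ("JUNQUA", -10.0),
    ("JUNKUA", -11.0),
    ("GUNQUA", -12.5),
    ("JUMQUA", -14.0),
    ("CHUNQUA", -20.0),
]

fixed = prune_paths(paths, beam_width=5.0)
adapted, final_beam = adaptive_prune(paths, beam_width=5.0, max_paths=2)
print([letters for letters, _ in fixed])
print([letters for letters, _ in adapted], final_beam)
```

With a fixed beam the weakest path is dropped. The adaptive variant keeps narrowing the beam until the path count fits the budget, which trades some recall for a bounded amount of work per frame.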

12. An apparatus for recognizing continuously spelled names input as a sequence of letters uttered into a microphonic transducer, comprising:
- a first speech recognizer for processing said sequence of letters to produce a first list comprising a plurality of groups of letters representing a set of N-Best letter sequence hypotheses where N is an integer greater than one;
  
  a name dictionary for representing possible choices of said continuously spelled names;
  
  alignment means coupled to said first speech recognizer and said name dictionary for performing alignment between said first list and said name dictionary and selecting a first plurality of names from said name dictionary that represents the N-best name candidates;
  
  a dynamic grammar storage coupled to said alignment means for storing said first plurality of names; and
  
  a second speech recognizer coupled to said dynamic grammar storage for processing said sequence of letters to select one candidate from said first plurality of names as representing a best hypothesis for the continuously spelled name.
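
The alignment step shared by claims 1 and 12 (compare the N-best letter sequences with the name dictionary, then keep the N-best name candidates as a dynamic grammar) can be sketched as follows. This is an illustration under assumptions, not the patented implementation: plain Levenshtein edit distance with unit costs stands in for the dynamic time warping alignment named in claim 9, each name is scored by its closest hypothesis, and the names `edit_distance` and `select_name_candidates` are hypothetical. The speech recognition passes themselves are not modeled.

```python
import heapq


def edit_distance(a, b):
    """Levenshtein distance between two letter strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def select_name_candidates(letter_hypotheses, dictionary, n):
    """Return the n dictionary names closest to any N-best letter hypothesis."""
    scored = []
    for name in dictionary:
        best = min(edit_distance(hyp, name) for hyp in letter_hypotheses)
        scored.append((best, name))
    # Ties on distance fall back to alphabetical order of the name.
    return [name for _, name in heapq.nsmallest(n, scored)]


# Made-up first-pass output: N-best letter sequences for one spelled name.
letter_hypotheses = ["JUNKUA", "GUNQUA", "JUNQA"]
dictionary = ["JUNQUA", "JUNEAU", "DUNCAN", "JACKSON", "GUNTHER"]

# The selected names would form the dynamic grammar for the second
# recognition pass over the same utterance.
dynamic_grammar = select_name_candidates(letter_hypotheses, dictionary, n=2)
print(dynamic_grammar)
```

The point of the sketch is the size reduction: the second recognizer only has to choose among the few names in `dynamic_grammar` rather than the whole dictionary. A real system would probably weight substitutions by how acoustically confusable the letters are, which unit costs ignore.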

Current Assignee: Panasonic Corporation Of North America (Panasonic Holdings Corporation)
Original Assignee: Panasonic Technologies, Inc. (Panasonic Holdings Corporation)
Inventors: Junqua, Jean-Claude
Primary Examiner(s): Zele, Krista M.
Assistant Examiner(s): Weaver, Scott Louis
Application Number: US08/435,881
Time in Patent Office: 893 Days
Field of Search: 395/2, 395/2.4, 395/2.56, 395/2.58, 395/2.6, 395/2.64, 395/2.63, 395/2.65, 395/2.84, 395/2.61, 395/2.79, 379/67, 379/88, 379/89
US Class Current: 704/255
CPC Class Codes:
G10L 15/08   Speech classification or se...
G10L 15/142   Hidden Markov Models [HMMs]
G10L 15/197   Probabilistic grammars, e.g...
