Process for the multilingual use of a hidden Markov sound model in a speech recognition system

US 6,212,500 B1
Filed: 03/09/1999
Issued: 04/03/2001
Est. Priority Date: 09/10/1996
Status: Expired due to Term

Abstract

In a method for determining the similarities of sounds across different languages, hidden Markov modelling of multilingual phonemes is employed, wherein language-specific as well as language-independent properties are identified by combining the probability densities for different hidden Markov sound models in various languages.

9 Claims

1. A method for modelling a sound in at least two languages, comprising the steps of:

(a) identifying a first feature vector for a first spoken sound in a first language;

(b) identifying a first hidden Markov sound model, from among a plurality of standard Markov sound models in a Markov sound model library, which most closely models said first feature vector;

(c) identifying a second feature vector for a second spoken sound, comparable to said first spoken sound, in a second language;

(d) identifying a second hidden Markov sound model from among said plurality of standard Markov sound models in said Markov sound model library, which most closely models said second feature vector;

(e) employing a predetermined criterion to select one of said first and second hidden Markov sound models as better modelling both of said first and second feature vectors; and

(f) modelling said first and second spoken sounds in both of said first and second languages using said one of said first and second hidden Markov sound models.

2. A method as claimed in claim 1 wherein identification of the first hidden Markov sound model which most closely models said first feature vector in step (b) and identification of said second hidden Markov sound model which most closely models said second feature vector in step (d) comprise identifying a logarithmic probability distance as a log likelihood distance between each standard Markov sound model in said library and said first feature vector and said second feature vector, respectively, with a shorter logarithmic probability distance denoting a better modelling.

3. A method as claimed in claim 2 wherein step (e) comprises forming an arithmetic mean of said logarithmic probability distance between each Markov sound model in said library and said first feature vector and said second feature vector, respectively, and using said arithmetic mean as said predetermined criterion.

4. A method as claimed in claim 3 wherein said first hidden Markov model is for a phoneme λ_i, wherein said second hidden Markov sound model is for a phoneme λ_j, and wherein X_i represents said first feature vector and wherein X_j represents said second feature vector, and wherein the step of identifying the logarithmic probability distance for said first feature vector comprises using the relationship

5. A method as claimed in claim 4 comprising the additional step of employing the selected one of said first and second hidden Markov sound models from step (e) for modelling of said first and second spoken words in step (f) only if d(λ_j; λ_i) satisfies a defined barrier condition.

6. A method as claimed in claim 1 comprising the additional step of providing a library of three-state Markov sound models as said Markov sound model library, each three-state Markov sound model comprising a sound segment of initial sound, median sound and final sound.
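
The selection procedure of claims 1 to 3 and 5 can be illustrated with a minimal sketch. It is not the patented implementation: each sound model is reduced to a single diagonal Gaussian instead of a hidden Markov model, the distance is assumed to be the negative log likelihood (so that a shorter distance means a better fit, as in claim 2), and the `barrier` parameter is a hypothetical stand-in for the "defined barrier condition" of claim 5. All names and numbers are illustrative.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical stand-in for a hidden Markov sound model: one
# diagonal-covariance Gaussian given as (means, variances).
Model = Tuple[Sequence[float], Sequence[float]]


def log_likelihood(x: Sequence[float], model: Model) -> float:
    """Log density of feature vector x under a diagonal Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, means, variances)
    )


def distance(x: Sequence[float], model: Model) -> float:
    """Assumed log-likelihood distance: smaller means a better fit."""
    return -log_likelihood(x, model)


def closest_model(x: Sequence[float], library: Dict[str, Model]) -> str:
    """Steps (b) and (d): name of the library model with the shortest distance."""
    return min(library, key=lambda name: distance(x, library[name]))


def select_shared_model(
    x1: Sequence[float],
    x2: Sequence[float],
    library: Dict[str, Model],
    barrier: float = float("inf"),
) -> Optional[str]:
    """Steps (e) and (f): choose one model for both sounds, or None.

    The criterion is the arithmetic mean of the two distances (claim 3).
    `barrier` is a hypothetical upper bound on that mean (claim 5).
    """
    candidates = {closest_model(x1, library), closest_model(x2, library)}

    def mean_distance(name: str) -> float:
        model = library[name]
        return 0.5 * (distance(x1, model) + distance(x2, model))

    best = min(sorted(candidates), key=mean_distance)
    return best if mean_distance(best) <= barrier else None


# Illustrative library and feature vectors (made-up numbers).
library = {
    "a_front": ([1.0, 0.0], [1.0, 1.0]),
    "a_back": ([3.0, 0.0], [1.0, 1.0]),
    "e": ([10.0, 5.0], [1.0, 1.0]),
}
x_first = [1.2, 0.1]   # sound in the first language
x_second = [2.4, 0.0]  # comparable sound in the second language
shared = select_shared_model(x_first, x_second, library)
```

With these values each language first picks a different model, and the mean distance then decides which of the two is used for both sounds. Passing a small `barrier` makes the function return `None`, which corresponds to keeping the sounds language-specific.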

7. A method for multilingual employment of a hidden Markov sound model in a speech recognition system, comprising the steps of:

(a) identifying a first hidden Markov sound model for a first spoken sound in a first language, said first hidden Markov sound model having a first standard probability distribution associated therewith;

(b) identifying a second hidden Markov sound model for a second spoken sound, comparable to said first spoken sound, in a second language, said second hidden Markov sound model having a second standard probability distribution associated therewith;

(c) combining said first standard probability distribution and said second standard probability distribution to form a new standard probability distribution up to a defined distance threshold, said defined distance threshold identifying a maximum distance between said first and second probability distributions within which said first and second standard probability distributions should be combined;

(d) forming a polyphoneme model using said new standard probability distribution only within said defined distance threshold and modelling said first and second sounds in both of said first and second languages using said polyphoneme model.

8. A method as claimed in claim 7 wherein said distance threshold is five.

9. A method as claimed in claim 7 comprising employing a three-state Markov sound model as said first hidden Markov sound model in step (a), said three-state Markov sound model comprising an initial sound segment, a median sound segment and a final sound segment of said first sound, and employing a three-state Markov sound model as said second hidden Markov sound model formed by an initial sound segment, a median sound segment and a final sound segment of said second sound.
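
The threshold-based combination in claims 7 and 8 can be sketched as follows. This is only an illustration under stated assumptions: each probability density is represented by its mean vector alone, the distance between two densities is assumed to be the L1 distance between their means, and two densities are combined by averaging. The claims listed here do not fix any of these choices; only the default threshold of five comes from claim 8. The function and variable names are hypothetical.

```python
from typing import List, Sequence

# Hypothetical representation: a density is reduced to its mean vector.
Density = List[float]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed distance measure between two densities (L1 on the means)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def merge_densities(
    first: List[Density],
    second: List[Density],
    threshold: float = 5.0,  # claim 8 names a distance threshold of five
) -> List[Density]:
    """Build a polyphoneme density set from two language-specific sets.

    Densities closer than `threshold` are combined into one (here by
    averaging, an assumption); all others are kept unchanged.
    """
    merged: List[Density] = []
    unused_second = [list(d) for d in second]
    for d1 in first:
        if unused_second:
            # Nearest still-unmerged density from the second language.
            d2 = min(unused_second, key=lambda d: l1_distance(d1, d))
            if l1_distance(d1, d2) <= threshold:
                unused_second.remove(d2)
                merged.append([(x + y) / 2.0 for x, y in zip(d1, d2)])
                continue
        merged.append(list(d1))
    # Densities of the second language that found no close partner.
    merged.extend(unused_second)
    return merged


# Illustrative density means for one sound in two languages.
first_language = [[0.0, 0.0], [10.0, 10.0]]
second_language = [[1.0, 1.0], [30.0, 30.0]]
polyphoneme = merge_densities(first_language, second_language)
```

In this example only the first pair lies within the threshold, so the polyphoneme set holds one shared density plus the two language-specific ones that were too far apart to combine.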

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: Siemens AG
Inventors: Kohler, Joachim
Primary Examiner(s): Knepper, David D.
Assistant Examiner(s): Sax, Robert L
Application Number: US09/254,775
Time in Patent Office: 756 Days
Field of Search: 704/231, 704/243, 704/257, 704/256, 704/244, 704/245, 704/9, 704/10, 704/1, 704/2
US Class Current: 704/256

CPC Class Codes

G10L 15/005   Language recognition
G10L 15/065   Adaptation
G10L 15/144   Training of HMMs
G10L 2015/0631   Creating reference template...
G10L 2015/0635   updating or merging of old ...
