Method and apparatus for constructing continuous parameter fenonic hidden Markov models by replacing phonetic models with continuous fenonic models

US 5,737,490 A
Filed: 10/22/1996
Issued: 04/07/1998
Est. Priority Date: 09/30/1993
Status: Expired due to Term

Abstract

A method and apparatus for constructing a hidden Markov model comprised of multiple fenones characterized by their duration and a set of acoustic properties. The present invention provides a sequence of fenones to model a speech event. The sequence may undergo modifications to improve the overall performance of the model.

31 Claims

1. A method of constructing a hidden Markov model of a given speech event comprising the steps of:
- providing a sequence of phonetic models representing the given speech event;
  
  creating a plurality of fenones associated with acoustic vectors, wherein each of the plurality of fenones is a hidden Markov model representing an acoustic event; and
  
  creating a fenonic model for said given speech event, wherein each of the sequence of phonetic models is replaced by at least one of the plurality of fenones, such that the fenonic model comprises a sequence of fenones.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 10, 11)
- - 2. The method defined in claim 1 wherein the step of creating a plurality of fenones includes characterizing each of the plurality of fenones by a duration and a set of acoustic properties.
  - 3. The method defined in claim 2 wherein the duration is characterized using a set of transition probabilities.
  - 4. The method defined in claim 2 wherein the set of acoustic properties includes a set of multivariate Gaussian distributions.
  - 5. The method defined in claim 4 wherein each of the set of multivariate Gaussian distributions includes a centroid and a covariance matrix.
  - 6. The method defined in claim 1 further comprising the step of modifying the fenonic model.
  - 7. The method defined in claim 6 wherein the step of modifying the fenonic model comprises inserting a fenone into the fenonic model.
  - 8. The method defined in claim 6 wherein the step of modifying the fenonic model comprises deleting a fenone from the fenonic model.
  - 10. The method defined in claim 6 further comprising the step of realigning the predetermined number of samples with the fenonic model.
  - 11. The method defined in claim 6 wherein the step of creating a plurality of fenones includes the steps of:
    - clustering acoustic vectors from training data aligned with each arc in the phonetic model, wherein acoustic vectors are clustered into a predetermined number of clusters; and
      
      generating the centroid and covariance matrix for each of said predetermined number of clusters.
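
The clustering step recited in claims 4, 5 and 11 can be illustrated with a minimal sketch. It assumes plain Lloyd's k-means with a fixed iteration count, full covariance matrices, and maximum-likelihood (divide by n) estimates; none of these choices is stated in the claims. The names `kmeans` and `gaussians_for_arc` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def mean(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a cluster (its centroid)."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def covariance(vectors: List[Vector], centroid: Vector) -> List[List[float]]:
    """Full covariance matrix around the centroid (divides by n)."""
    n, dim = len(vectors), len(centroid)
    return [
        [sum((v[i] - centroid[i]) * (v[j] - centroid[j]) for v in vectors) / n
         for j in range(dim)]
        for i in range(dim)
    ]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors: List[Vector], k: int, iters: int = 20,
           seed: int = 0) -> List[List[Vector]]:
    """Plain Lloyd's k-means; returns the non-empty clusters."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    clusters: List[List[Vector]] = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return [c for c in clusters if c]


def gaussians_for_arc(vectors: List[Vector],
                      k: int) -> List[Tuple[List[float], List[List[float]]]]:
    """One (centroid, covariance) pair per cluster of the arc's vectors."""
    result = []
    for cluster in kmeans(vectors, k):
        centroid = mean(cluster)
        result.append((centroid, covariance(cluster, centroid)))
    return result
```

Here `vectors` stands for all acoustic vectors aligned with one phonetic arc, and `k` for the predetermined number of clusters.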

9. A method for constructing a hidden Markov model for a speech event that is represented by a phonetic sequence, said method comprising the steps of:
- creating a plurality of fenones in a one-to-one correspondence with each of a plurality of distinct phonetic arcs in a phonetic model, wherein the plurality of fenones are associated with acoustic vectors; and
  
  creating a fenonic model for a given speech event, wherein the step of creating a fenonic model includes the steps of selecting a predetermined number of instances of said given speech event; and
  
  replacing each arc in the phonetic sequence by one of the plurality of fenones to create a sequence of fenones, wherein said one of the plurality of fenones is selected so as to maximize the joint likelihood of all vectors aligned with the arc in the predetermined number of samples, such that the fenonic model for the given speech event is produced.
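
The selection rule in claim 9 (pick, for each arc, the fenone that maximizes the joint likelihood of all vectors aligned with that arc) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: each fenone is reduced to a single diagonal-covariance Gaussian, transition probabilities are ignored, and likelihoods are handled in the log domain. `DiagGaussian`, `best_fenone` and `fenonic_baseform` are hypothetical names.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
# Hypothetical fenone representation: (mean, per-dimension variance).
DiagGaussian = Tuple[Sequence[float], Sequence[float]]


def log_density(x: Vector, gaussian: DiagGaussian) -> float:
    """Log of a diagonal-covariance Gaussian density evaluated at x."""
    mean, var = gaussian
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def joint_log_likelihood(vectors: List[Vector], gaussian: DiagGaussian) -> float:
    """Sum of log densities, i.e. the log of the joint likelihood."""
    return sum(log_density(x, gaussian) for x in vectors)


def best_fenone(aligned_vectors: List[Vector],
                fenones: Dict[str, DiagGaussian]) -> str:
    """Name of the fenone with the highest joint log-likelihood."""
    return max(
        fenones,
        key=lambda name: joint_log_likelihood(aligned_vectors, fenones[name]),
    )


def fenonic_baseform(arc_alignments: List[List[Vector]],
                     fenones: Dict[str, DiagGaussian]) -> List[str]:
    """Replace each arc (given by its pooled aligned vectors) with one fenone."""
    return [best_fenone(vectors, fenones) for vectors in arc_alignments]
```

Each entry of `arc_alignments` is assumed to pool the vectors aligned with one arc across all selected instances of the speech event.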

15. A method for constructing a hidden Markov model for a speech event that is represented by a phonetic sequence, said method comprising the steps of:
- creating a plurality of fenones in a one-to-one correspondence with each of a plurality of distinct phonetic arcs; and
  
  creating a fenonic model for a given speech event, wherein the step of creating a fenonic model includes the steps of selecting a predetermined number of instances of said given speech event; and
  
  replacing each arc in the phonetic sequence by one of the plurality of fenones to create a sequence of fenones, wherein said one of the plurality of fenones is selected so as to maximize the joint likelihood of all vectors aligned with the arc in the predetermined number of samples, such that the fenonic model with the sequence of fenones for the given speech event is produced;
  
  modifying the fenonic model.
- View Dependent Claims (12, 13, 14, 16, 17, 18, 19, 20, 21, 23)
- - 12. The method defined in claim 15 wherein the step of clustering is performed by k-means clustering.
  - 13. The method defined in claim 15 wherein the step of creating a plurality of fenones further includes specifying the duration of each of the plurality of fenones.
  - 14. The method defined in claim 17 wherein the duration is specified by assigning a set of transition probabilities to each of the plurality of fenones, wherein one of each of the set is assigned to each arc in one of the plurality of fenones.
  - 16. The method defined in claim 15 wherein the step of modifying includes substituting at least one of the plurality of fenones for a fenone in the sequence of fenones in the fenonic model.
  - 17. The method defined in claim 16 wherein said at least one of the plurality of fenones is substituted for said fenone if the joint likelihood of said at least one of the plurality of fenones less a predetermined threshold is greater than the joint likelihood of said fenone in the sequence of fenones for which said at least one of the plurality of fenones is substituted.
  - 18. The method defined in claim 16 wherein said fenone is replaced by a pair of fenones where at least one of said pair of fenones comprises said fenone.
  - 19. The method defined in claim 15 wherein the step of modifying includes inserting a fenone into the sequence of fenones in the fenonic model.
  - 20. The method defined in claim 15 wherein the step of modifying includes deleting a fenone in the sequence of fenones in the fenonic model.
  - 21. The method defined in claim 20 wherein said fenone is deleted from the sequence of fenones when the joint likelihood of a pair of fenones that comprises said fenone and a following and immediately adjacent fenone in said sequence less a predetermined threshold is greater than the joint likelihood of said fenone.
  - 23. The method defined in claim 19 wherein the step of aligning includes obtaining a Viterbi alignment.
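
The thresholded substitution test of claim 17 can be sketched as a single greedy pass over the fenone sequence. The sketch assumes that scores are log-likelihoods, so "less a predetermined threshold" becomes a subtraction in the log domain, and that a caller-supplied `score(position, fenone)` callback returns the joint log-likelihood of the vectors aligned with that position. Both the callback and the greedy pass are illustrative assumptions, not details taken from the claims.

```python
from typing import Callable, List, Sequence


def should_substitute(candidate_ll: float, current_ll: float,
                      threshold: float) -> bool:
    """True if the candidate beats the current fenone by more than the threshold."""
    return candidate_ll - threshold > current_ll


def refine(sequence: List[str], alphabet: Sequence[str],
           score: Callable[[int, str], float], threshold: float) -> List[str]:
    """One pass that substitutes a fenone only when the gain exceeds the threshold."""
    refined = list(sequence)
    for pos, current in enumerate(sequence):
        current_ll = score(pos, current)
        best = max(alphabet, key=lambda f: score(pos, f))
        if best != current and should_substitute(score(pos, best),
                                                 current_ll, threshold):
            refined[pos] = best
    return refined
```

The threshold acts as a margin: a substitution that improves the score only slightly is rejected, which keeps the sequence stable against small fluctuations in the training samples.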

22. A method for constructing a Markov model comprising the steps of:
- recording training data;
  
  processing the training data into a sequence of acoustic vectors;
  
  aligning the sequence of vectors with linear phonetic Markov models;
  
  creating a first alphabet of distinct phonetic arcs in the training data;
  
  for each arc in the first alphabet, extracting the acoustic vectors aligned with said each arc in the first alphabet;
  
  clustering the acoustic vectors aligned with said each arc to create a predetermined number of clusters;
  
  computing centroid and covariance matrices for each of said predetermined number of clusters;
  
  defining a second alphabet of a plurality of fenones, wherein the plurality of fenones are associated with acoustic vectors and each of the plurality of fenones is represented by an elementary HMM and is associated with a plurality of Gaussian distributions and a set of transition probabilities; and
  
  replacing each distinct arc in the phonetic sequence with one of the plurality of fenones, such that a fenonic model for the speech event is created.
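
The alignment step in claim 22 (and the Viterbi alignment named in claim 23) can be illustrated with a minimal Viterbi pass over a linear, left-to-right model. The sketch simplifies the topology: every state emits one vector per frame and has only a self-loop and a forward transition, whereas the patent's models attach outputs to arcs. The names `viterbi_align`, `log_emit`, `log_stay` and `log_move` are hypothetical.

```python
from typing import List

NEG_INF = float("-inf")


def viterbi_align(log_emit: List[List[float]], log_stay: List[float],
                  log_move: List[float]) -> List[int]:
    """Best state index per frame for a linear left-to-right model.

    log_emit[t][s] is the log-probability of frame t under state s,
    log_stay[s] the self-loop log-probability of state s, and
    log_move[s] the log-probability of moving from state s to s + 1.
    The path starts in state 0 and ends in the last state.
    """
    n_frames, n_states = len(log_emit), len(log_stay)
    if n_frames < n_states:
        raise ValueError("need at least one frame per state")
    score = [[NEG_INF] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    score[0][0] = log_emit[0][0]
    for t in range(1, n_frames):
        for s in range(n_states):
            stay = score[t - 1][s] + log_stay[s]
            move = score[t - 1][s - 1] + log_move[s - 1] if s > 0 else NEG_INF
            if move > stay:
                score[t][s], back[t][s] = move, s - 1
            else:
                score[t][s], back[t][s] = stay, s
            score[t][s] += log_emit[t][s]
    # Trace back from the final state at the last frame.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]
```

Grouping the frames by the returned state index yields the vectors aligned with each state, which is the input the clustering step needs.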

24. A Markov model for a speech event created according to a method comprising the steps of:
- providing a sequence of phonetic models representing the given speech event;
  
  creating a plurality of fenones, wherein each of the plurality of fenones is a hidden Markov model representing an acoustic event; and
  
  creating a fenonic model for said given speech event, wherein each of the sequence of phonetic models is replaced by at least one of the plurality of fenones, such that the fenonic model includes a sequence of fenones from the plurality of fenones; and
  
  modifying the fenonic model.
- View Dependent Claims (25, 26)
- - 25. The method defined in claim 24 wherein the step of modifying the fenonic model comprises inserting a fenone into the fenonic model.
  - 26. The method defined in claim 24 wherein the step of modifying the fenonic model comprises deleting a fenone from the fenonic model.

27. An apparatus for constructing a hidden Markov model of a given speech event comprising the steps of:
- means for providing a sequence of phonetic models representing the given speech event;
  
  means for creating a plurality of fenones associated with acoustic vectors, wherein each of the plurality of fenones is a hidden Markov model representing an acoustic event;
  
  means for creating a fenonic model for said given speech event, wherein each of the sequence of phonetic models is replaced by at least one of the plurality of fenones, such that the fenonic model comprises a sequence of fenones; and
  
  means for modifying the fenonic model.

28. A speech processing apparatus comprising:
- a speech input device;
  
  an acoustic processor that produces a plurality of acoustic vectors in response to speech received by the speech input device;
  
  an alignment mechanism that aligns the plurality of acoustic vectors with phonetic Markov models; and
  
  a fenonic processor that constructs fenonic Hidden Markov Models (HMMs), wherein the fenonic processor creates a plurality of fenones associated with the plurality of acoustic vectors, where each of the plurality of fenones is a hidden Markov model representing an acoustic event, and creates a fenonic model for a given speech event by replacing each of a sequence of phonetic models representing the given speech event with at least one of the plurality of fenones, thereby creating the fenonic model as a sequence of fenones.
- View Dependent Claims (29, 30, 31)
- - 29. The apparatus defined in claim 28 wherein the fenonic processor modifies the fenonic model.
  - 30. The apparatus defined in claim 28 wherein the fenonic processor modifies the fenonic model by inserting a fenone into the fenonic model.
  - 31. The apparatus defined in claim 28 wherein the fenonic processor modifies the fenonic model by deleting a fenone from the fenonic model.

Current Assignee: Apple Inc.
Original Assignee: Apple Computer Incorporated (Apple Inc.)
Inventors: de Souza, Peter Vincent; Austin, Stephen Christopher
Primary Examiner(s): MacDonald, Allen R.
Assistant Examiner(s): EDOUARD, PATRICK NESTOR
Application Number: US08/735,049
Time in Patent Office: 532 Days
Field of Search: 395/2.65, 395/2.64, 395/2.54, 391/2.91-2.95
US Class Current: 704/256
CPC Class Codes:
G10L 15/144 Training of HMMs
G10L 2015/025 Phonemes, fenemes or fenone...
