Method and apparatus for constructing continuous parameter fenonic hidden Markov models by replacing phonetic models with continuous fenonic models
Abstract
A method and apparatus for constructing a hidden Markov model comprised of multiple fenones characterized by their duration and a set of acoustic properties. The present invention provides a sequence of fenones to model a speech event. The sequence may undergo modifications to improve the overall performance of the model.
31 Claims
1. A method of constructing a hidden Markov model of a given speech event comprising the steps of:
- providing a sequence of phonetic models representing the given speech event;
- creating a plurality of fenones associated with acoustic vectors, wherein each of the plurality of fenones is a hidden Markov model representing an acoustic event; and
- creating a fenonic model for said given speech event, wherein each of the sequence of phonetic models is replaced by at least one of the plurality of fenones, such that the fenonic model comprises a sequence of fenones.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 10, 11
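The substitution recited in claim 1 can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the `Fenone` class, the `build_fenonic_model` function, the phone labels, and the replacement table are all hypothetical, and each fenone is reduced to a label and two transition probabilities.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Fenone:
    """Hypothetical elementary HMM, reduced to a label and two transition probabilities."""
    label: str
    p_self_loop: float
    p_forward: float


def build_fenonic_model(phonetic_sequence: List[str],
                        replacements: Dict[str, List[Fenone]]) -> List[Fenone]:
    """Replace each phonetic model by at least one fenone, in order."""
    fenonic_model: List[Fenone] = []
    for phone in phonetic_sequence:
        fenones = replacements.get(phone, [])
        if not fenones:
            # Every phonetic model must map to one or more fenones.
            raise ValueError(f"no fenone available for phonetic model {phone!r}")
        fenonic_model.extend(fenones)
    return fenonic_model


# Illustrative values only: labels and probabilities are made up.
F1 = Fenone("F1", 0.6, 0.4)
F2 = Fenone("F2", 0.5, 0.5)
F3 = Fenone("F3", 0.7, 0.3)
F4 = Fenone("F4", 0.4, 0.6)
replacements = {"K": [F1], "AE": [F2, F3], "T": [F4]}

model = build_fenonic_model(["K", "AE", "T"], replacements)
print([f.label for f in model])
```

The result is a flat sequence of fenones, which matches the claim's wording that the fenonic model "comprises a sequence of fenones".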
9. A method for constructing a hidden Markov model for a speech event that is represented by a phonetic sequence, said method comprising the steps of:
- creating a plurality of fenones in a one-to-one correspondence with each of a plurality of distinct phonetic arcs in a phonetic model, wherein the plurality of fenones are associated with acoustic vectors; and
- creating a fenonic model for a given speech event, wherein the step of creating a fenonic model includes the steps of
  - selecting a predetermined number of instances of said given speech event; and
  - replacing each arc in the phonetic sequence by one of the plurality of fenones to create a sequence of fenones, wherein said one of the plurality of fenones is selected so as to maximize the joint likelihood of all vectors aligned with the arc in the predetermined number of samples, such that the fenonic model for the given speech event is produced.
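The selection rule in claim 9 (pick the fenone that maximizes the joint likelihood of all vectors aligned with an arc) can be sketched as follows. This is a simplified illustration, not the claimed method itself: it assumes each fenone emits from a single diagonal-covariance Gaussian, treats the aligned vectors as independent, and ignores transition probabilities. The names `log_gaussian` and `select_fenone` and the example values are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
# Assumption: one diagonal Gaussian per fenone, stored as (means, variances).
Gaussian = Tuple[Vector, Vector]


def log_gaussian(x: Vector, mean: Vector, var: Vector) -> float:
    """Log density of x under a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def select_fenone(aligned_vectors: List[Vector],
                  fenones: Dict[str, Gaussian]) -> str:
    """Return the fenone label with the highest joint log-likelihood.

    aligned_vectors pools every vector aligned with the arc across all
    selected instances of the speech event. Under the independence
    assumption the joint likelihood is a product, so its log is a sum.
    """
    def joint_log_likelihood(label: str) -> float:
        mean, var = fenones[label]
        return sum(log_gaussian(x, mean, var) for x in aligned_vectors)

    return max(fenones, key=joint_log_likelihood)


# Illustrative values only.
fenones = {
    "F1": ([0.0, 0.0], [1.0, 1.0]),
    "F2": ([3.0, 3.0], [1.0, 1.0]),
}
vectors_for_arc = [[2.8, 3.1], [3.2, 2.9], [2.5, 3.4]]
best = select_fenone(vectors_for_arc, fenones)
print(best)
```

Working in log space avoids numerical underflow when many vectors are pooled across instances.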
15. A method for constructing a hidden Markov model for a speech event that is represented by a phonetic sequence, said method comprising the steps of:
- creating a plurality of fenones in a one-to-one correspondence with each of a plurality of distinct phonetic arcs; and
- creating a fenonic model for a given speech event, wherein the step of creating a fenonic model includes the steps of
  - selecting a predetermined number of instances of said given speech event; and
  - replacing each arc in the phonetic sequence by one of the plurality of fenones to create a sequence of fenones, wherein said one of the plurality of fenones is selected so as to maximize the joint likelihood of all vectors aligned with the arc in the predetermined number of samples, such that the fenonic model with the sequence of fenones for the given speech event is produced;
- modifying the fenonic model.
Dependent claims: 12, 13, 14, 16, 17, 18, 19, 20, 21, 23
22. A method for constructing a Markov model comprising the steps of:
- recording training data;
- processing the training data into a sequence of acoustic vectors;
- aligning the sequence of vectors with linear phonetic Markov models;
- creating a first alphabet of distinct phonetic arcs in the training data;
- for each arc in the first alphabet, extracting the acoustic vectors aligned with said each arc in the first alphabet;
- clustering the acoustic vectors aligned with said each arc to create a predetermined number of clusters;
- computing centroid and covariance matrices for each of said predetermined number of clusters;
- defining a second alphabet of a plurality of fenones, wherein the plurality of fenones are associated with acoustic vectors and each of the plurality of fenones is represented by an elementary HMM and is associated with a plurality of Gaussian distributions and a set of transition probabilities; and
- replacing each distinct arc in the phonetic sequence with one of the plurality of fenones, such that a fenonic model for the speech event is created.
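The clustering and statistics steps of claim 22 can be illustrated with a short Python sketch. The claim does not name a clustering algorithm, so plain k-means with a deterministic initialization is used here purely as an assumption. The function names and data are hypothetical, and the covariance is the population (divide-by-n) estimate.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(vectors: List[Vector]) -> Vector:
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def covariance(vectors: List[Vector], mean: Vector) -> Matrix:
    """Population covariance matrix of the vectors around the given mean."""
    n, d = len(vectors), len(mean)
    return [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / n
             for j in range(d)] for i in range(d)]


def kmeans(vectors: List[Vector], k: int, iterations: int = 20) -> List[List[Vector]]:
    """Assumed clustering step: k-means seeded with k evenly spaced input vectors."""
    if not 1 <= k <= len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    step = len(vectors) // k
    centers = [list(vectors[i * step]) for i in range(k)]
    clusters: List[List[Vector]] = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: sq_dist(v, centers[c]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        centers = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters


def cluster_statistics(vectors: List[Vector], k: int) -> List[Tuple[Vector, Matrix]]:
    """Centroid and covariance matrix for each non-empty cluster of one arc's vectors."""
    stats = []
    for cluster in kmeans(vectors, k):
        if cluster:
            mean = centroid(cluster)
            stats.append((mean, covariance(cluster, mean)))
    return stats


# Illustrative vectors aligned with a single arc, forming two obvious groups.
arc_vectors = [[0.0, 0.0], [0.0, 2.0], [2.0, 0.0], [2.0, 2.0],
               [10.0, 10.0], [10.0, 12.0], [12.0, 10.0], [12.0, 12.0]]
stats = cluster_statistics(arc_vectors, k=2)
for mean, cov in stats:
    print(mean, cov)
```

Each (centroid, covariance) pair could then parameterize one Gaussian of the fenone associated with that arc, in line with the claim's "plurality of Gaussian distributions".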
24. A Markov model for a speech event created according to a method comprising the steps of:
- providing a sequence of phonetic models representing the given speech event;
- creating a plurality of fenones, wherein each of the plurality of fenones is a hidden Markov model representing an acoustic event; and
- creating a fenonic model for said given speech event, wherein each of the sequence of phonetic models is replaced by at least one of the plurality of fenones, such that the fenonic model includes a sequence of fenones from the plurality of fenones; and
- modifying the fenonic model.
Dependent claims: 25, 26
27. An apparatus for constructing a hidden Markov model of a given speech event comprising the steps of:
- means for providing a sequence of phonetic models representing the given speech event;
- means for creating a plurality of fenones associated with acoustic vectors, wherein each of the plurality of fenones is a hidden Markov model representing an acoustic event;
- means for creating a fenonic model for said given speech event, wherein each of the sequence of phonetic models is replaced by at least one of the plurality of fenones, such that the fenonic model comprises a sequence of fenones; and
- means for modifying the fenonic model.
28. A speech processing apparatus comprising:
- a speech input device;
- an acoustic processor that produces a plurality of acoustic vectors in response to speech received by the speech input device;
- an alignment mechanism that aligns the plurality of acoustic vectors with phonetic Markov models; and
- a fenonic processor that constructs fenonic Hidden Markov Models (HMMs), wherein the fenonic processor creates a plurality of fenones associated with the plurality of acoustic vectors, where each of the plurality of fenones is a hidden Markov model representing an acoustic event, and creates a fenonic model for a given speech event by replacing each of a sequence of phonetic models representing the given speech event with at least one of the plurality of fenones, thereby creating the fenonic model as a sequence of fenones.
Dependent claims: 29, 30, 31
Specification