Methodology for generating enhanced demiphone acoustic models for speech recognition
Abstract
A system and method for effectively performing speech recognition procedures includes enhanced demiphone acoustic models that a speech recognition engine utilizes to perform the speech recognition procedures. The enhanced demiphone acoustic models each have three states that are collectively arranged to form a preceding demiphone and a succeeding demiphone. An acoustic model generator may utilize a decision tree for analyzing speech context information from a training database. The acoustic model generator then effectively configures each of the enhanced demiphone acoustic models as either a succeeding-dominant enhanced demiphone acoustic model or a preceding-dominant enhanced demiphone acoustic model to accurately model speech characteristics.
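The two configurations named in the abstract can be pictured as two ways of splitting three states between the preceding and succeeding demiphones. The following is a minimal illustrative sketch, not the patented implementation. The names `Dominance`, `DemiphoneModel`, and `state_split` are hypothetical. The succeeding-dominant split (first state for the preceding demiphone, second and third states for the succeeding demiphone) follows claim 42; the preceding-dominant split is assumed here to be the mirror image, with the first two states forming the preceding demiphone.

```python
from dataclasses import dataclass
from enum import Enum


class Dominance(Enum):
    """Hypothetical labels for the two configurations named in the claims."""
    PRECEDING = "preceding-dominant"
    SUCCEEDING = "succeeding-dominant"


@dataclass(frozen=True)
class DemiphoneModel:
    """Three-state acoustic model split into two demiphones (illustrative only)."""
    phone: str
    dominance: Dominance

    def state_split(self):
        """Return (preceding_states, succeeding_states) as tuples of state indices 0..2."""
        if self.dominance is Dominance.SUCCEEDING:
            # Per claim 42: first state -> preceding demiphone,
            # second and third states -> succeeding demiphone.
            return (0,), (1, 2)
        # Assumption: the preceding-dominant case mirrors the above,
        # with the first two states forming the preceding demiphone.
        return (0, 1), (2,)


if __name__ == "__main__":
    for dominance in Dominance:
        model = DemiphoneModel(phone="ae", dominance=dominance)
        print(dominance.value, model.state_split())
```

How the acoustic model generator chooses between the two configurations (for example, through the decision tree mentioned in the abstract) is not described in this section, so the sketch takes the configuration as a given input.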
43 Claims
1. A system for implementing a speech recognition engine, comprising:
demiphone acoustic models that said speech recognition engine utilizes to perform speech recognition procedures, said demiphone acoustic models each having three states that collectively form a preceding demiphone and a succeeding demiphone; and
an acoustic model generator that analyzes speech context information to configure each of said demiphone acoustic models as either a succeeding-dominant demiphone acoustic model or a preceding-dominant demiphone acoustic model.
(Dependent claims: 2-20)
21. A method for implementing a speech recognition engine, comprising:
utilizing demiphone acoustic models to perform speech recognition procedures, each of said demiphone acoustic models having three states that collectively form a preceding demiphone and a succeeding demiphone; and
analyzing speech context information with an acoustic model generator to configure each of said demiphone acoustic models as either a succeeding-dominant demiphone acoustic model or a preceding-dominant demiphone acoustic model.
(Dependent claims: 22-40)
41. A system for implementing a speech recognition engine, comprising:
means for performing speech recognition procedures, said means for performing speech recognition procedures each having three states that collectively form a preceding demiphone and a succeeding demiphone; and
means for configuring each of said means for performing speech recognition procedures as either a succeeding-dominant demiphone acoustic model or a preceding-dominant demiphone acoustic model.
42. A system for implementing a speech recognition engine, comprising:
demiphone acoustic models that each have three states that collectively form a succeeding demiphone and a preceding demiphone, said demiphone acoustic models all being configured in a succeeding-dominant configuration that has a first state forming said preceding demiphone, said succeeding-dominant configuration also having a second state and a third state forming said succeeding demiphone; and
a speech recognition engine that utilizes said demiphone acoustic models to perform speech recognition procedures.
43. An electronic device comprising:
an electronic data processor; and
a speech recognition engine implemented by the electronic data processor;
wherein the speech recognition engine comprises acoustic models, each acoustic model having three states, the three states being used to form a first demiphone and a second demiphone;
wherein the first demiphone is based on a speech element immediately preceding a speech element being modeled, and the second demiphone is based on a speech element immediately succeeding the speech element being modeled;
wherein for at least one of the acoustic models, the first demiphone is based on a first of the states and the second demiphone is based on the remaining two of the states; and
wherein for at least one of the acoustic models, the first demiphone is based on two of the states and the second demiphone is based on the remaining one of the states.
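Claim 43 ties the first demiphone to the speech element immediately preceding the one being modeled and the second demiphone to the element immediately succeeding it. As a rough illustration only, the sketch below lists those two context elements for each phone in a sequence. The function name `demiphone_contexts` and the `"sil"` boundary symbol used at the sequence edges are assumptions made for this example and do not come from the patent text.

```python
def demiphone_contexts(phones, boundary="sil"):
    """For each phone, return (phone, preceding_context, succeeding_context).

    The preceding context is what the first demiphone would be based on and
    the succeeding context is what the second demiphone would be based on.
    `boundary` is an assumed placeholder for positions with no neighbor.
    """
    contexts = []
    for i, phone in enumerate(phones):
        preceding = phones[i - 1] if i > 0 else boundary
        succeeding = phones[i + 1] if i + 1 < len(phones) else boundary
        contexts.append((phone, preceding, succeeding))
    return contexts


if __name__ == "__main__":
    for phone, before, after in demiphone_contexts(["k", "ae", "t"]):
        print(f"{phone}: first demiphone context={before}, second demiphone context={after}")
```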
Specification