Methodology for generating enhanced demiphone acoustic models for speech recognition
First Claim
1. A system for implementing a speech recognition engine, comprising:
- demiphone acoustic models that said speech recognition engine utilizes to perform speech recognition procedures, said demiphone acoustic models each having three states that collectively form a preceding demiphone and a succeeding demiphone; and
- an acoustic model generator that analyzes speech context information to configure each of said demiphone acoustic models as either a succeeding-dominant demiphone acoustic model or a preceding-dominant demiphone acoustic model, a contextual dominance for each demiphone state from a given one of said demiphone acoustic models being determined by analyzing predominant contextual information in a triphone decision tree corresponding to said each demiphone state.
Abstract
A system and method for effectively performing speech recognition procedures includes enhanced demiphone acoustic models that a speech recognition engine utilizes to perform the speech recognition procedures. The enhanced demiphone acoustic models each have three states that are collectively arranged to form a preceding demiphone and a succeeding demiphone. An acoustic model generator may utilize a decision tree for analyzing speech context information from a training database. The acoustic model generator then effectively configures each of the enhanced demiphone acoustic models as either a succeeding-dominant enhanced demiphone acoustic model or a preceding-dominant enhanced demiphone acoustic model to accurately model speech characteristics.
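As a rough illustration of the structure the abstract describes, the following Python sketch represents a three-state model split into a preceding and a succeeding demiphone. The class names, the 2+1 or 1+2 state split, and the example phone are assumptions made for illustration; the abstract itself only says that three states are arranged into two demiphones and that each model is configured as preceding-dominant or succeeding-dominant.

```python
from dataclasses import dataclass
from enum import Enum


class Dominance(Enum):
    PRECEDING = "preceding-dominant"
    SUCCEEDING = "succeeding-dominant"


@dataclass(frozen=True)
class DemiphoneModel:
    """Hypothetical container for one three-state demiphone acoustic model."""
    phone: str
    dominance: Dominance

    def state_split(self):
        """Return (preceding_states, succeeding_states) as tuples of state indices.

        Assumption: the dominant demiphone owns the middle state, so the
        three states split 2+1 or 1+2.
        """
        if self.dominance is Dominance.PRECEDING:
            return (0, 1), (2,)
        return (0,), (1, 2)


# Illustrative phone label; not taken from the patent text.
model = DemiphoneModel(phone="ae", dominance=Dominance.SUCCEEDING)
preceding, succeeding = model.state_split()
print(model.dominance.value, preceding, succeeding)
```

Under this assumed layout, switching `dominance` only moves the middle state from one demiphone to the other; the first and last states always stay with the preceding and succeeding demiphone respectively.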
18 Citations
14 Claims
1. A system for implementing a speech recognition engine, comprising:
- demiphone acoustic models that said speech recognition engine utilizes to perform speech recognition procedures, said demiphone acoustic models each having three states that collectively form a preceding demiphone and a succeeding demiphone; and
- an acoustic model generator that analyzes speech context information to configure each of said demiphone acoustic models as either a succeeding-dominant demiphone acoustic model or a preceding-dominant demiphone acoustic model, a contextual dominance for each demiphone state from a given one of said demiphone acoustic models being determined by analyzing predominant contextual information in a triphone decision tree corresponding to said each demiphone state.
2. A system for implementing a speech recognition engine, comprising:
- demiphone acoustic models that said speech recognition engine utilizes to perform speech recognition procedures, said demiphone acoustic models each having three states that collectively form a preceding demiphone and a succeeding demiphone; and
- an acoustic model generator that analyzes speech context information to configure each of said demiphone acoustic models as either a succeeding-dominant demiphone acoustic model or a preceding-dominant demiphone acoustic model, said speech context information being identified by decision trees that each include a series of questions, said questions each corresponding to a different acoustic speech characteristic, said questions each also being used to identify a contextual dominance characteristic corresponding to said different acoustic speech characteristic.
Dependent claims: 3, 4, 5, 6, 7
8. A method for implementing a speech recognition engine, comprising:
- utilizing demiphone acoustic models to perform speech recognition procedures, each of said demiphone acoustic models having three states that collectively form a preceding demiphone and a succeeding demiphone; and
- analyzing speech context information with an acoustic model generator to configure each of said demiphone acoustic models as either a succeeding-dominant demiphone acoustic model or a preceding-dominant demiphone acoustic model, a contextual dominance for each demiphone state from a given one of said demiphone acoustic models being determined by analyzing predominant contextual information in a triphone decision tree corresponding to said each demiphone state.
9. A method for implementing a speech recognition engine, comprising:
- utilizing demiphone acoustic models to perform speech recognition procedures, each of said demiphone acoustic models having three states that collectively form a preceding demiphone and a succeeding demiphone; and
- analyzing speech context information with an acoustic model generator to configure each of said demiphone acoustic models as either a succeeding-dominant demiphone acoustic model or a preceding-dominant demiphone acoustic model, said speech context information being identified by decision trees that each include a series of questions, said questions each corresponding to a different acoustic speech characteristic, said questions each also being used to identify a contextual dominance characteristic corresponding to said different acoustic speech characteristic.
Dependent claims: 10, 11, 12, 13, 14
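The dominance decision recited in the claims can be pictured as a tally over decision-tree questions. This is a minimal sketch, assuming that each question in a state's triphone decision tree can be labeled as asking about the preceding or the succeeding phone, and that a simple majority decides dominance. The `Question` type, the example questions, and the tie-breaking rule are hypothetical and are not taken from the claims.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Question(NamedTuple):
    """Hypothetical decision-tree question about a neighbouring phone."""
    text: str
    context: str  # "preceding" or "succeeding"


def dominant_context(questions: Iterable[Question]) -> str:
    """Return the context asked about most often in one state's tree.

    Assumption: ties fall back to "preceding"; the claims do not say
    how a tie would be resolved.
    """
    counts = Counter(q.context for q in questions)
    if counts["succeeding"] > counts["preceding"]:
        return "succeeding"
    return "preceding"


# Illustrative questions for the middle state of one model.
middle_state_tree = [
    Question("Is the next phone a nasal?", "succeeding"),
    Question("Is the previous phone a vowel?", "preceding"),
    Question("Is the next phone voiced?", "succeeding"),
]
print(dominant_context(middle_state_tree))
```

A real acoustic model generator would presumably weight questions by their position in the tree or by the amount of training data they separate; the flat count here is only the simplest way to express "predominant contextual information".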
Specification