Decoding multiple HMM sets using a single sentence grammar
Abstract
For a given sentence grammar, speech recognizers are often required to decode M sets of HMMs, each of which models a specific acoustic environment. In order to match input acoustic observations to each of the environments, recognition search methods typically require a network of M sub-networks. A new speech recognition search method is described here, which needs a network that is only the size of a single sub-network and yet gives the same recognition performance, thus reducing the memory requirement for network storage by (M−1)/M.
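The (M−1)/M figure follows from storing one sub-network instead of M copies. A minimal arithmetic sketch, assuming all M sub-networks have the same size; the function name `network_storage` and the size units are hypothetical and do not come from the patent:

```python
def network_storage(num_environments: int, subnetwork_size: int) -> dict:
    """Compare storage for M replicated sub-networks with one shared base network."""
    # Conventional search: one sub-network per environment (M copies).
    replicated = num_environments * subnetwork_size
    # Described method: a single generic base network shared by all HMM sets.
    shared = subnetwork_size
    # Fraction of network storage saved; algebraically (M - 1) / M.
    saving = (replicated - shared) / replicated
    return {"replicated": replicated, "shared": shared, "saving": saving}


# Example: two environments (e.g. male and female), 1000 storage units per sub-network.
result = network_storage(num_environments=2, subnetwork_size=1000)
```

With M = 2 the saving is one half, and it approaches the full replicated size as M grows. The sketch counts only network storage, which is the quantity the abstract refers to.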
9 Citations
15 Claims
1. A speech recognizer for decoding multiple HMM sets using a generic base sentence network comprising:
- means for decoding HMM sets using the generic base sentence network wherein each HMM set of said HMM sets is a group of HMMs from one environment and a recognizer responsive to input speech for recognizing speech using said decoded multiple HMM sets wherein the means for decoding includes means for building recognition paths defined on expanded symbols and accessing said network using base symbols through a conversion function that gives the base symbol of any expanded symbols, and vice versa.
2. A speech recognition device comprising:
- a speech recognizer processing device responsive to input speech and HMM sets for recognizing speech and an output device for presenting the recognized speech to a user, said recognizer processing device comprising means for decoding multiple HMM sets using a generic base sentence network wherein each HMM set of said HMM sets is a group of HMMs from one environment comprising the steps of;
providing a generic grammar, providing expanded symbols representing a network of expanded HMM sets and building recognition paths defined by the expanded symbols and accessing the generic base sentence network using base symbols through a proper conversion function that gives the base symbol of any expanded symbols, and vice versa.
Dependent claims: 3, 4, 5
6. In a speech recognition device including a recognizer processing device coupled to receive input speech and HMM sets for recognizing speech and an output device for presenting the recognized speech to a user, a method in said recognizer processing device for decoding multiple HMM sets using a generic base sentence network wherein each HMM set of said HMM sets is a group of HMMs from one environment comprising the steps of:
- providing a generic network containing base symbols;
a plurality of sets of HMMs where each set of HMMs corresponds to a single environmental factor such as for male and female;
each said set of HMMs enumerated in terms of expanded symbols which map to the generic network base symbols;
accessing said generic network using said base symbols through a conversion function that gives base symbols for expanded symbols to therefore decode multiple HMM sets using a generic base sentence grammar and using said HMM sets to recognize incoming speech.
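One way to picture the conversion function recited in claim 6 is a block-offset numbering, sketched below. This is an illustrative assumption, not the encoding the patent specifies: the claim only requires that some function map expanded symbols to base symbols. The symbol names, the environment list, and every function name here are hypothetical.

```python
# Hypothetical generic network vocabulary (base symbols) and environments.
BASE_SYMBOLS = ["sil", "one", "two"]
ENVIRONMENTS = ["male", "female"]
NUM_BASE = len(BASE_SYMBOLS)


def to_expanded(base: int, env: int) -> int:
    """Assumed encoding: environment `env` owns the block [env*NUM_BASE, (env+1)*NUM_BASE)."""
    return env * NUM_BASE + base


def to_base(expanded: int) -> int:
    """Conversion function: recover the base symbol used to access the generic network."""
    return expanded % NUM_BASE


def environment_of(expanded: int) -> int:
    """Index of the HMM set (environment) that an expanded symbol belongs to."""
    return expanded // NUM_BASE


def expansions(base: int) -> list:
    """All expanded symbols that map to one base symbol, one per environment."""
    return [to_expanded(base, env) for env in range(len(ENVIRONMENTS))]


# "one" in the female HMM set is expanded symbol 4, which maps back to base symbol 1.
female_one = to_expanded(base=1, env=1)
```

Under this assumed numbering the grammar is stored once over three base symbols, while the recognizer works with six expanded symbols, each naming a concrete HMM in one set.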
7. A speech recognizer comprising:
- an input means for receiving speech, a recognizer processing device recognizing speech using multiple HMM sets and provide output signals representing recognized speech, said recognizer processing device including means for decoding said multiple HMM sets using a generic base sentence network wherein the means for decoding includes means for building recognition paths defined on expanded symbols and accessing said network using base symbols through a conversion function that gives the base symbol of any expanded symbols, and vice versa the processing steps of a main loop, path-propagation, update-observation-probability, within-model path, and cross-model path build extensions to the recognition paths by calculating Δhmms in the processing steps get-offsets and get-true-symbols which interface between the generic base network object and the multiple environment HMM sets.
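The interplay of get-offsets and get-true-symbols during a cross-model path extension might look roughly like the sketch below. It is a loose illustration under stated assumptions, not the patented procedure: the offset is assumed to be a per-environment block offset, extended paths are assumed to stay within one environment, and the tiny network, `NUM_BASE`, and all function bodies are hypothetical.

```python
NUM_BASE = 3  # hypothetical number of base symbols in the generic network

# Hypothetical generic base network: base symbol -> successor base symbols.
BASE_NETWORK = {0: [1, 2], 1: [0], 2: [0]}


def get_offset(expanded: int) -> int:
    """Assumed offset (the claim's delta): start of the block of the symbol's HMM set."""
    return (expanded // NUM_BASE) * NUM_BASE


def get_true_symbol(base: int, delta: int) -> int:
    """Turn a base symbol from the network into an expanded symbol of one HMM set."""
    return base + delta


def cross_model_successors(expanded: int) -> list:
    """Extend a path that ends in `expanded` by consulting the base network only."""
    delta = get_offset(expanded)   # which environment the path is in
    base = expanded - delta        # base symbol used to access the network
    # Successors are looked up on base symbols, then mapped back into the
    # same environment (an assumption of this sketch).
    return [get_true_symbol(nxt, delta) for nxt in BASE_NETWORK[base]]


# A path ending in expanded symbol 3 (base 0, second environment) extends to 4 and 5.
next_symbols = cross_model_successors(3)
```

The point of the sketch is that the stored network never mentions expanded symbols; the two helper steps translate at the boundary between the network and the HMM sets.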
8. A speech recognizer comprising in combination:
- means for receiving input speech, a recognizer processing device including model sets and means for comparing input speech to said model sets to recognize said input speech and an output device for presenting the recognized speech to a user, said recognizer processing device including means for decoding a plurality of model sets using a generic base grammar network composed of base-symbols wherein each model set of said model sets is a group of models from one environment comprising;
means for constructing recognition paths defined on expanded-symbols wherein each expanded-symbol references a model contained in one of the model sets, and means for determining expanded-symbols by a conversion function that maps a base-symbol of the generic base grammar network to a plurality of expanded-symbols and an expanded-symbol to its corresponding base-symbol.
Dependent claims: 9, 10, 11, 12, 13, 14, 15
Specification