Decoding multiple HMM sets using a single sentence grammar

US 7,269,558 B2
Filed: 07/26/2001
Issued: 09/11/2007
Est. Priority Date: 07/31/2000
Status: Expired due to Term

Abstract

For a given sentence grammar, speech recognizers are often required to decode M sets of HMMs, each of which models a specific acoustic environment. In order to match input acoustic observations to each of the environments, recognition search methods typically require a network of M sub-networks. A new speech recognition search method is described here, which needs a network that is only the size of a single sub-network and yet gives the same recognition performance, thus reducing the memory requirement for network storage by (M−1)/M.
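
The (M−1)/M figure follows from replacing M sub-networks with one network of the same size as a single sub-network. The short sketch below works through that arithmetic; the function name `storage_saving` is hypothetical, and it assumes all M sub-networks would have been the same size.

```python
from fractions import Fraction


def storage_saving(num_hmm_sets: int) -> Fraction:
    """Fraction of network storage saved by keeping one base network
    instead of one sub-network per HMM set.

    Assumption (not stated in the abstract): every sub-network has the
    same size, so M sub-networks cost M units and the single network 1.
    """
    if num_hmm_sets < 1:
        raise ValueError("need at least one HMM set")
    return Fraction(num_hmm_sets - 1, num_hmm_sets)


if __name__ == "__main__":
    # Example: two environments (e.g. male and female) halve the storage.
    for m in (1, 2, 4):
        print(f"M={m}: saving={storage_saving(m)}")
```

With M = 2 the saving is 1/2, and it approaches 1 as M grows.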

9 Citations

15 Claims

1. A speech recognizer for decoding multiple HMM sets using a generic base sentence network comprising:
- means for decoding HMM sets using the generic base sentence network wherein each HMM set of said HMM sets is a group of HMMs from one environment and a recognizer responsive to input speech for recognizing speech using said decoded multiple HMM sets wherein the means for decoding includes means for building recognition paths defined on expanded symbols and accessing said network using base symbols through a conversion function that gives the base symbol of any expanded symbols, and vice versa.
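
The conversion function in claim 1 can be pictured as a two-way mapping between a base symbol in the generic network and its expanded symbols, one per HMM set. The following is a minimal sketch only: the `env:symbol` string encoding, the environment names, and the function names `expand` and `base_of` are all assumptions made for illustration, not taken from the patent.

```python
from typing import List

# Hypothetical environment labels, one per HMM set.
HMM_SETS = ("male", "female")


def expand(base_symbol: str) -> List[str]:
    """Base symbol -> its expanded symbols, one for each HMM set."""
    return [f"{env}:{base_symbol}" for env in HMM_SETS]


def base_of(expanded_symbol: str) -> str:
    """Expanded symbol -> the base symbol used to access the network."""
    env, sep, base = expanded_symbol.partition(":")
    if not sep or env not in HMM_SETS or not base:
        raise ValueError(f"not an expanded symbol: {expanded_symbol!r}")
    return base


if __name__ == "__main__":
    expanded = expand("yes")
    print(expanded)
    # Every expanded symbol converts back to the same base symbol.
    print({base_of(e) for e in expanded})
```

Because the grammar network is only ever indexed by the result of `base_of`, it never needs to store a separate copy per environment.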

2. A speech recognition device comprising:
- a speech recognizer processing device responsive to input speech and HMM sets for recognizing speech and an output device for presenting the recognized speech to a user, said recognizer processing device comprising means for decoding multiple HMM sets using a generic base sentence network wherein each HMM set of said HMM sets is a group of HMMs from one environment comprising the steps of;
  
  providing a generic grammar, providing expanded symbols representing a network of expanded HMM sets and building recognition paths defined by the expanded symbols and accessing the generic base sentence network using base symbols through a proper conversion function that gives the base symbol of any expanded symbols, and vice versa.
  - 3. The method of claim 2 wherein said building step includes for each frame path propagation expansion within each expanded HMM set based on the grammar network and update-observation-probability.
  - 4. The method of claim 3 wherein said path propagation includes getting offsets that index each HMM set, retrieving individual expanded symbols for each HMM set that correspond to base symbols within the generic grammar network, and extending a Viterbi search for each expanded symbol for each HMM set individually and separately by obtaining the HMM of the previous frame and expanding and storing a sequence set of HMM states both for within model path and cross model path and determining the path with the best transition probability.
  - 5. The method of claim 2 wherein the processing steps of a main loop, path-propagation, update-observation-probability, within-model path, and cross-model path build extensions to the recognition paths by calculating Δ_hmms in the processing steps get-offsets and get-true-symbols which interface between the generic base network object and the multiple environment HMM sets.
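
Claims 4 and 5 name two interface steps, get-offsets and get-true-symbols, without fixing a data layout. One way they could work is sketched below, assuming integer symbol codes and a contiguous layout in which HMM set k occupies the codes from k × N to k × N + N − 1 for N base symbols. The layout, the function signatures, and the helper `get_base_symbol` are assumptions for illustration.

```python
from typing import List


def get_offsets(num_hmm_sets: int, num_base_symbols: int) -> List[int]:
    """One offset per HMM set (assumed layout: set k starts at k * N)."""
    return [k * num_base_symbols for k in range(num_hmm_sets)]


def get_true_symbols(base_symbol: int, offsets: List[int]) -> List[int]:
    """Expanded ("true") symbols for one base symbol, one per HMM set."""
    return [base_symbol + offset for offset in offsets]


def get_base_symbol(true_symbol: int, num_base_symbols: int) -> int:
    """Inverse conversion: strip the set offset from an expanded symbol."""
    return true_symbol % num_base_symbols


if __name__ == "__main__":
    n_sets, n_base = 2, 3
    offsets = get_offsets(n_sets, n_base)
    for base in range(n_base):
        true_symbols = get_true_symbols(base, offsets)
        print(base, true_symbols)
```

Under this layout the offset plays the role of the per-set index, so the same base network can be shared by every HMM set during the search.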

6. In a speech recognition device including a recognizer processing device coupled to receive input speech and HMM sets for recognizing speech and an output device for presenting the recognized speech to a user, a method in said recognizer processing device for decoding multiple HMM sets using a generic base sentence network wherein each HMM set of said HMM sets is a group of HMMs from one environment comprising the steps of:
- providing a generic network containing base symbols;
  
  a plurality of sets of HMMs where each set of HMMs corresponds to a single environmental factor such as for male and female;
  
  each said set of HMMs enumerated in terms of expanded symbols which map to the generic network base symbols;
  
  accessing said generic network using said base symbols through a conversion function that gives base symbols for expanded symbols to therefore decode multiple HMM sets using a generic base sentence grammar and using said HMM sets to recognize incoming speech.

7. A speech recognizer comprising:
- an input means for receiving speech, a recognizer processing device recognizing speech using multiple HMM sets and provide output signals representing recognized speech, said recognizer processing device including means for decoding said multiple HMM sets using a generic base sentence network wherein the means for decoding includes means for building recognition paths defined on expanded symbols and accessing said network using base symbols through a conversion function that gives the base symbol of any expanded symbols, and vice versa the processing steps of a main loop, path-propagation, update-observation-probability, within-model path, and cross-model path build extensions to the recognition paths by calculating Δ_hmms in the processing steps get-offsets and get-true-symbols which interface between the generic base network object and the multiple environment HMM sets.

8. A speech recognizer comprising in combination:
- means for receiving input speech, a recognizer processing device including model sets and means for comparing input speech to said model sets to recognize said input speech and an output device for presenting the recognized speech to a user, said recognizer processing device including means for decoding a plurality of model sets using a generic base grammar network composed of base-symbols wherein each model set of said model sets is a group of models from one environment comprising;
  
  means for constructing recognition paths defined on expanded-symbols wherein each expanded-symbol references a model contained in one of the model sets, and means for determining expanded-symbols by a conversion function that maps a base-symbol of the generic base grammar network to a plurality of expanded-symbols and an expanded-symbol to its corresponding base-symbol.
  - 9. The recognizer of claim 8 wherein said recognition path construction includes means for constraining each recognition path to expanded-symbols referencing models within one model set.
  - 10. The recognizer of claim 8 wherein the model sets are HMM model sets.
  - 11. The recognizer of claim 8 wherein the models of each set correspond to a single environmental factor.
  - 12. The recognizer of claim 8 wherein the recognition procedure consists of a recognition path construction procedure and an update observation probability procedure.
  - 13. The recognition path construction procedure of claim 12 wherein construction of recognition paths consists of extending the path, wherein the path defined by a present expanded-symbol and its referenced model is extended within the referenced model by a within-model-path procedure and to additional expanded-symbols by a cross-model-path procedure.
  - 14. The cross-model-path procedure of claim 13 in which the path from a present expanded-symbol is extended to additional expanded-symbols by determining the present base-symbol corresponding to the present expanded-symbol, determining which additional base-symbols may follow the present-base symbol according to the generic base grammar network, and determining the additional expanded-symbols using the conversion function which maps each additional base-symbol of the generic base grammar network to a plurality of additional expanded-symbols.
  - 15. The update observation probability procedure of claim 12 in which the probability of speech is included in the extended recognition paths for each of the models corresponding to expanded-symbols on the recognition paths.
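
The cross-model-path procedure of claim 14, combined with the single-model-set constraint of claim 9, can be sketched as three steps: convert the present expanded-symbol to its base-symbol, look up the base-symbols that may follow it in the generic grammar, and convert those back to expanded-symbols. The toy grammar, the `(set_index, base_symbol)` tuple encoding, and all function names below are hypothetical.

```python
from typing import Dict, List, Tuple

Expanded = Tuple[int, str]  # assumed encoding: (model-set index, base-symbol)

# Hypothetical generic base grammar: base-symbol -> base-symbols that may follow.
GRAMMAR: Dict[str, List[str]] = {"call": ["john", "mary"], "john": [], "mary": []}
NUM_MODEL_SETS = 2


def to_base(expanded: Expanded) -> str:
    """Conversion function, expanded-symbol -> base-symbol."""
    return expanded[1]


def to_expanded(base: str) -> List[Expanded]:
    """Conversion function, base-symbol -> one expanded-symbol per model set."""
    return [(k, base) for k in range(NUM_MODEL_SETS)]


def cross_model_successors(present: Expanded, same_set_only: bool = True) -> List[Expanded]:
    """Expanded-symbols a path may be extended to from `present`."""
    set_index = present[0]
    successors: List[Expanded] = []
    for next_base in GRAMMAR[to_base(present)]:   # grammar is accessed by base-symbol only
        for candidate in to_expanded(next_base):
            if same_set_only and candidate[0] != set_index:
                continue                          # claim 9: stay within one model set
            successors.append(candidate)
    return successors


if __name__ == "__main__":
    print(cross_model_successors((1, "call")))
    print(cross_model_successors((1, "call"), same_set_only=False))
```

Note that `GRAMMAR` holds base-symbols only, so its size does not depend on the number of model sets.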

Current Assignee: Texas Instruments, Inc.
Original Assignee: Texas Instruments, Inc.
Inventors: Gong, Yifan
Primary Examiner(s): CHAWAN, VIJAY B
Application Number: US09/915,911
Publication Number: US 20020042710A1
Time in Patent Office: 2,238 Days
Field of Search: 704/255-257, 704/231, 704/232, 704/251, 704/240-243
US Class Current: 704/256.2
CPC Class Codes: G10L 15/142 Hidden Markov Models [HMMs]