METHOD FOR BUILDING ACOUSTIC MODEL, SPEECH RECOGNITION METHOD AND ELECTRONIC APPARATUS
Abstract
A method for building an acoustic model, a speech recognition method, and an electronic apparatus are provided. The speech recognition method includes the following steps. A plurality of phonetic transcriptions of a speech signal is obtained from an acoustic model. A plurality of vocabularies matching the phonetic transcriptions is obtained according to each phonetic transcription and a syllable acoustic lexicon, wherein the syllable acoustic lexicon includes the vocabularies corresponding to the phonetic transcriptions, and each vocabulary having at least one phonetic transcription includes a code corresponding to that phonetic transcription. A plurality of strings and a plurality of string probabilities are obtained from a language model according to the code of each of the vocabularies.
32 Claims
1. A method for building an acoustic model, adapted to an electronic apparatus, the method comprising:
receiving a plurality of speech signals;
receiving a plurality of phonetic transcriptions matching pronunciations in the speech signals; and
obtaining data of a plurality of phones corresponding to the phonetic transcriptions in the acoustic model by training according to the speech signals and the phonetic transcriptions.
Dependent claims: 2

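
The training step of claim 1 can be illustrated with a toy sketch. The claim does not name a training algorithm, so this example assumes the simplest possible stand-in: each feature frame of the speech signals has already been aligned to a phone label taken from the phonetic transcriptions, and the "data" stored for each phone is just the mean of its frames. The function name `train_phone_means`, the two-dimensional features, and the phone labels are all hypothetical; a practical system would more likely use hidden Markov models or neural networks.

```python
from collections import defaultdict


def train_phone_means(frames, phone_labels):
    """Return {phone: mean feature vector} from frame-aligned training data.

    Assumption (not from the claims): frames[i] is a feature vector that has
    already been aligned to the phone phone_labels[i].
    """
    if len(frames) != len(phone_labels):
        raise ValueError("each frame needs exactly one phone label")

    sums = {}
    counts = defaultdict(int)
    for frame, phone in zip(frames, phone_labels):
        if phone not in sums:
            sums[phone] = [0.0] * len(frame)
        for i, value in enumerate(frame):
            sums[phone][i] += value
        counts[phone] += 1

    # Divide each accumulated sum by the number of frames seen for that phone.
    return {
        phone: [total / counts[phone] for total in vector]
        for phone, vector in sums.items()
    }


# Made-up training data: three frames, two distinct phones.
frames = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0]]
labels = ["a", "a", "n"]
phone_means = train_phone_means(frames, labels)
```
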
3. A speech recognition method, adapted to an electronic apparatus, comprising:
obtaining a plurality of phonetic transcriptions of a speech signal according to an acoustic model, and the phonetic transcriptions including a plurality of phones;
obtaining a plurality of vocabularies matching the phonetic transcriptions and obtaining a fuzzy sound probability of the phonetic transcription matching each of the vocabularies according to each of the phonetic transcriptions and a syllable acoustic lexicon; and
selecting the vocabulary corresponding to a largest one among the fuzzy sound probabilities to be used as the vocabularies matching the speech signal.
Dependent claims: 4, 5, 6, 7

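
The selection step of claim 3 amounts to a lookup followed by an argmax. The sketch below assumes the syllable acoustic lexicon can be modeled as a mapping from each phonetic transcription to its candidate vocabularies and their fuzzy sound probabilities. The lexicon contents, the placeholder names (`pt_1`, `word_a`, and so on), and the function `best_vocabulary` are invented for illustration and do not come from the patent.

```python
# Hypothetical syllable acoustic lexicon:
# phonetic transcription -> {vocabulary: fuzzy sound probability}
LEXICON = {
    "pt_1": {"word_a": 0.9, "word_b": 0.6},
    "pt_2": {"word_a": 0.7, "word_c": 0.8},
}


def best_vocabulary(transcriptions, lexicon):
    """Return (vocabulary, probability) with the largest fuzzy sound probability.

    Returns None when no transcription has an entry in the lexicon.
    """
    candidates = {}
    for transcription in transcriptions:
        for vocabulary, probability in lexicon.get(transcription, {}).items():
            # Keep the best score seen so far for each vocabulary.
            previous = candidates.get(vocabulary, 0.0)
            candidates[vocabulary] = max(previous, probability)

    if not candidates:
        return None
    return max(candidates.items(), key=lambda item: item[1])


best = best_vocabulary(["pt_1", "pt_2"], LEXICON)
```

Keeping only the highest score per vocabulary is a design choice of this sketch; the claim itself only requires choosing the vocabulary with the largest fuzzy sound probability.
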
8. A speech recognition method, adapted to an electronic apparatus, comprising:
obtaining a plurality of phonetic transcriptions of the speech signal according to an acoustic model, and the phonetic transcriptions including a plurality of phones;
obtaining a plurality of vocabularies matching the phonetic transcriptions according to each of the phonetic transcriptions and a syllable acoustic lexicon, wherein the syllable acoustic lexicon comprises the vocabularies corresponding to the phonetic transcriptions, and the vocabulary having at least one phonetic transcription comprises each of codes corresponding to each of the phonetic transcriptions;
obtaining a plurality of strings and a plurality of string probabilities from a language model according to the code of each of the vocabularies; and
selecting the string corresponding to a largest one among the string probabilities as a recognition result of the speech signal.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16

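
The code-based lookup of claim 8 can be sketched as follows. The claim does not say what kind of language model is used, so this example assumes a bigram model over codes, where each code identifies one vocabulary together with one of its phonetic transcriptions. All codes, probabilities, and names (`LEXICON`, `BIGRAM`, `recognize`) are made up for illustration.

```python
from itertools import product

# Hypothetical lexicon: phonetic transcription -> codes of matching vocabularies.
LEXICON = {
    "pt_1": ["c101", "c102"],
    "pt_2": ["c201"],
}
# Each code stands for one vocabulary under one specific transcription.
CODE_TO_VOCAB = {"c101": "word_a", "c102": "word_b", "c201": "word_c"}
# Hypothetical bigram language model over codes; "<s>" marks the string start.
BIGRAM = {
    ("<s>", "c101"): 0.2,
    ("<s>", "c102"): 0.5,
    ("c101", "c201"): 0.6,
    ("c102", "c201"): 0.1,
}


def string_probability(codes, bigram, floor=1e-6):
    """Multiply bigram probabilities along the code sequence."""
    probability = 1.0
    previous = "<s>"
    for code in codes:
        # Unseen code pairs get a small floor value instead of zero.
        probability *= bigram.get((previous, code), floor)
        previous = code
    return probability


def recognize(transcriptions, lexicon, bigram):
    """Return the (codes, probability) pair with the largest string probability."""
    candidates = product(*(lexicon[t] for t in transcriptions))
    scored = [(codes, string_probability(codes, bigram)) for codes in candidates]
    return max(scored, key=lambda item: item[1])


best_codes, best_prob = recognize(["pt_1", "pt_2"], LEXICON, BIGRAM)
best_string = " ".join(CODE_TO_VOCAB[code] for code in best_codes)
```

Enumerating every code sequence is only practical for short inputs; it is used here to keep the correspondence with the claim steps visible.
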
17. An electronic apparatus, comprising:
an input unit, receiving a plurality of speech signals;
a storage unit, storing a plurality of program code segments; and
a processing unit, coupled to the input unit and the storage unit, the processing unit executing a plurality of commands through the program code segments, and the commands comprising:
receiving a plurality of phonetic transcriptions matching pronunciations in the speech signals; and
obtaining data of a plurality of phones corresponding to the phonetic transcriptions in the acoustic model by training according to the speech signals and the phonetic transcriptions.
Dependent claims: 18

19. An electronic apparatus, comprising:
an input unit, receiving a speech signal;
a storage unit, storing a plurality of program code segments; and
a processing unit, coupled to the input unit and the storage unit, the processing unit executing a plurality of commands through the program code segments, and the commands comprising:
obtaining a plurality of phonetic transcriptions of the speech signal according to an acoustic model, and the phonetic transcriptions including a plurality of phones;
obtaining a plurality of vocabularies matching the phonetic transcriptions and obtaining a fuzzy sound probability of the phonetic transcription matching each of the vocabularies according to each of the phonetic transcriptions and a syllable acoustic lexicon; and
selecting the vocabulary corresponding to a largest one among the fuzzy sound probabilities to be used as the vocabularies matching the speech signal.
Dependent claims: 20, 21, 22, 23

24. An electronic apparatus, comprising:
an input unit, receiving a speech signal;
a storage unit, storing a plurality of program code segments; and
a processing unit, coupled to the input unit and the storage unit, the processing unit executing a plurality of commands through the program code segments, and the commands comprising:
obtaining a plurality of phonetic transcriptions of the speech signal according to an acoustic model, and the phonetic transcriptions including a plurality of phones;
obtaining a plurality of vocabularies matching the phonetic transcriptions according to each of the phonetic transcriptions and a syllable acoustic lexicon, wherein the syllable acoustic lexicon comprises the vocabularies corresponding to the phonetic transcriptions, and the vocabulary having at least one phonetic transcription comprises each of codes corresponding to each of the phonetic transcriptions;
obtaining a plurality of strings and a plurality of string probabilities from a language model according to the code of each of the vocabularies; and
selecting the string corresponding to a largest one among the string probabilities as a recognition result of the speech signal.
Dependent claims: 25, 26, 27, 28, 29, 30, 31, 32

Specification