Method and apparatus for recognizing tone languages using pitch information
Abstract
A method and an apparatus for automatic recognition of tone languages, employing the steps of converting the words of speech into an electrical signal, generating spectral features from the electrical signal, extracting pitch values from the electrical signal, combining the spectral features and the pitch values into acoustic feature vectors, comparing the acoustic feature vectors with prototypes of phonemes in an acoustic prototype database including prototypes of toned vowels to produce labels, and matching the labels to text using a decoder comprising a phonetic vocabulary and a language model database.
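As an illustration only, the front-end steps named in the abstract (spectral features, pitch values, combined acoustic feature vector) might be sketched as follows. The abstract does not say how the spectral features or the pitch are computed, so the pooled DFT band energies, the autocorrelation pitch estimator, the function names, the 8 kHz sample rate, and the 60 to 400 Hz search range are all assumptions made for this sketch, not the patented implementation.

```python
import math

def band_log_energies(frame, n_bands=4):
    """Stand-in spectral features: naive DFT power pooled into n_bands log-energies."""
    n = len(frame)
    half = n // 2
    power = []
    for k in range(half):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        power.append(re * re + im * im)
    width = half // n_bands
    # Small constant avoids log(0) for empty bands.
    return [math.log(sum(power[b * width:(b + 1) * width]) + 1e-10)
            for b in range(n_bands)]

def autocorr_pitch(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate pitch in Hz as the lag with the largest autocorrelation."""
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_val = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        val = sum(frame[t] * frame[t + lag] for t in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag

def acoustic_feature_vector(frame, sample_rate):
    """Concatenate spectral features with the pitch value for one frame."""
    return band_log_energies(frame) + [autocorr_pitch(frame, sample_rate)]

# Synthetic voiced frame: a pure 200 Hz tone sampled at 8 kHz (assumed values).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200.0 * t / sample_rate) for t in range(400)]
vector = acoustic_feature_vector(frame, sample_rate)
```

Here `vector` holds four band energies followed by one pitch value. A real recognizer would compute one such vector per short frame of speech and would need a separate convention for unvoiced frames, which this sketch does not address.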
40 Citations
19 Claims
-
1. A method for identifying toned vowels in words of speech comprising:
converting the words of speech into an electrical signal;
generating spectral features from said electrical signal;
extracting pitch values from said electrical signal;
combining said spectral features and said pitch values into acoustic feature vectors;
comparing said acoustic feature vectors with prototypes of phonemes in an acoustic prototype database including prototypes of toned vowels to produce labels; and
matching said labels to text using a decoder comprising a phonetic vocabulary and a language model database.
Dependent claims: 2, 3, 4, 5, 6, 7
preparing a training text from said words of speech;
transcribing said training text into sequences of phonemes including vowels with tones;
converting spoken utterances of said training text into an electrical signal;
generating spectral features from said electrical signal;
extracting pitch values from said electrical signal;
combining said spectral features and said pitch values into acoustic feature vectors;
comparing said acoustic feature vectors with said sequences of phonemes including vowels with tone to produce acoustic prototypes for each phoneme.
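The training steps listed above end with one acoustic prototype per phoneme, and claim 1 then compares new feature vectors against those prototypes to produce labels. A minimal sketch of that idea, assuming each prototype is simply the mean of the training vectors aligned to a phoneme and that comparison is by Euclidean distance, is given below. The claims do not specify the prototype form or the distance measure, so both are simplifications chosen for illustration. The phoneme names (`a1`, `a4`, `m`) and the two-dimensional toy vectors (one spectral value, one pitch value) are hypothetical.

```python
import math

def train_prototypes(labeled_vectors):
    """Average the feature vectors aligned to each phoneme (toned vowels included)."""
    sums, counts = {}, {}
    for phoneme, vec in labeled_vectors:
        if phoneme not in sums:
            sums[phoneme] = [0.0] * len(vec)
            counts[phoneme] = 0
        sums[phoneme] = [s + v for s, v in zip(sums[phoneme], vec)]
        counts[phoneme] += 1
    return {p: [s / counts[p] for s in sums[p]] for p in sums}

def label_vector(vec, prototypes):
    """Return the phoneme whose prototype is nearest to vec."""
    return min(prototypes, key=lambda p: math.dist(vec, prototypes[p]))

# Hypothetical training data: (phoneme, [spectral value, pitch value]).
# "a1" and "a4" stand for the same vowel carrying two different tones.
training = [
    ("a1", [1.0, 220.0]),
    ("a1", [1.2, 224.0]),
    ("a4", [1.1, 150.0]),
    ("a4", [0.9, 146.0]),
    ("m", [5.0, 120.0]),
]
prototypes = train_prototypes(training)
label = label_vector([1.0, 215.0], prototypes)  # nearest prototype is "a1"
```

Because the two toned vowels have nearly identical spectral values in this toy data, only the pitch component separates them, which is the motivation for combining pitch with spectral features in one vector.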
-
-
3. The method of claim 2, wherein said acoustic prototypes are stored in a database.
-
4. The method of claim 1, wherein said phonetic vocabulary comprises a database of words of speech including tone information.
-
5. The method of claim 1, wherein said language model database is used to determine a probability of a word.
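Claims 4 and 5 describe a phonetic vocabulary that carries tone information and a language model that supplies word probabilities. One way these two pieces could interact in a decoder is sketched below, under loud assumptions: the vocabulary is a plain dictionary from word to toned phoneme sequence, the language model is a unigram relative-frequency estimate, and the counts are invented for the example. The word entries are English glosses of Mandarin syllables (ma1 "mother", ma3 "horse", ma3 "code", ma4 "scold"), used purely as an illustration of a tone language.

```python
# Hypothetical phonetic vocabulary: word -> phoneme sequence with toned vowels.
vocabulary = {
    "MOTHER": ("m", "a1"),
    "HORSE": ("m", "a3"),
    "CODE": ("m", "a3"),
    "SCOLD": ("m", "a4"),
}

# Hypothetical corpus counts standing in for a language model database.
word_counts = {"MOTHER": 40, "HORSE": 30, "CODE": 10, "SCOLD": 20}

def word_probability(word):
    """Unigram relative frequency of a word (an assumed, simplified language model)."""
    return word_counts[word] / sum(word_counts.values())

def decode(labels):
    """Pick the most probable word whose toned phoneme sequence matches the labels."""
    candidates = [w for w, phones in vocabulary.items() if phones == tuple(labels)]
    if not candidates:
        return None
    return max(candidates, key=word_probability)

best = decode(["m", "a3"])  # "HORSE": tone 3 rules out MOTHER and SCOLD, the counts rule out CODE
```

The tone labels narrow the candidates to words with a matching toned vowel, and the word probability breaks the remaining tie between homophones. A practical decoder would score word sequences rather than single words.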
-
6. The method of claim 1, wherein said words of speech comprise at least one syllable having tonal content.
-
7. The method of claim 6, wherein said toned vowel determines a tone of said syllable.
-
8. A program storage device readable by machine, tangibly embodying a program of instructions executable by machine to perform the method steps for identifying toned vowels in words of speech, the method comprising the steps of:
converting the words of speech into an electrical signal;
generating spectral features from said electrical signal;
extracting pitch values from said electrical signal;
combining said spectral features and said pitch values into acoustic feature vectors;
comparing said acoustic feature vectors with prototypes of phonemes in an acoustic prototype database including prototypes of toned vowels to produce labels; and
matching said labels to text using a decoder comprising a phonetic vocabulary and a language model database.
Dependent claims: 9, 10, 11, 12, 13, 14
receiving as input a training text from said words of speech;
transcribing said training text into sequences of phonemes including vowels with tones;
converting spoken utterances of said training text into an electrical signal;
generating spectral features from said electrical signal;
extracting pitch values from said electrical signal;
combining said spectral features and said pitch values into acoustic feature vectors;
comparing said acoustic feature vectors with said sequences of phonemes including vowels with tone to produce acoustic prototypes for each phoneme.
-
-
10. The program storage device of claim 9, wherein said acoustic prototypes are stored in a database.
-
11. The program storage device of claim 8, wherein said phonetic vocabulary comprises a database of words of speech including tone information.
-
12. The program storage device of claim 8, wherein said language model database is used to determine a probability of a word.
-
13. The program storage device of claim 8, wherein said words of speech comprise at least one syllable having tonal content.
-
14. The program storage device of claim 8, wherein said toned vowel determines a tone of said syllable.
-
15. A system for identifying toned vowels in words of speech, comprising:
means for converting the words of speech into an electrical signal;
means for generating spectral features from said electrical signal;
means for extracting pitch values from said electrical signal;
means for combining said spectral features and said pitch values into acoustic feature vectors;
means for comparing said acoustic feature vectors with prototypes of phonemes in an acoustic prototype database including prototypes of toned vowels to produce labels; and
means for matching said labels to text using a decoder comprising a phonetic vocabulary and a language model database.
Dependent claims: 16, 17, 18, 19
-
Specification