Segmental tonal modeling for tonal languages
Abstract
A phone set for use in speech processing, such as speech recognition or text-to-speech conversion, is used to model or form syllables of a tonal language having a plurality of different tones. Each syllable includes an initial part that can be glide dependent and a final part. The final part includes a plurality of phones. Each phone carries partial tonal information such that the phones, taken together, implicitly and jointly represent the different tones.
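The decomposition described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-level (H/L) tone table, the Mandarin-style tone numbers, the `_H`/`_L` phone naming, and the function name `decompose` are all hypothetical and are not taken from the patent text shown here.

```python
# Hypothetical tone table: each tone is spelled as a pair of categorical
# pitch levels, one per temporal portion of the final (assumed mapping).
TONE_TO_LEVELS = {
    1: ("H", "H"),  # level tone
    2: ("L", "H"),  # rising tone
    3: ("L", "L"),  # low tone
    4: ("H", "L"),  # falling tone
}


def decompose(initial, glide, final_portions, tone):
    """Return the phone sequence for one syllable.

    initial        -- consonant onset, e.g. "zh"
    glide          -- glide string, or "" if the syllable has none
    final_portions -- (first, second) temporal portions of the final
    tone           -- tone number, a key of TONE_TO_LEVELS
    """
    first_level, second_level = TONE_TO_LEVELS[tone]
    # The glide is folded into the initial, giving a glide-dependent initial.
    initial_phone = initial + glide
    first, second = final_portions
    # Each final phone carries only one pitch level; the tone is recoverable
    # only from the two phones taken together.
    return [initial_phone, f"{first}_{first_level}", f"{second}_{second_level}"]


print(decompose("zh", "u", ("a", "ng"), 4))
```

Under these assumptions, no phone is tagged with a full tone; the falling tone in the example exists only as the sequence of an H-tagged phone followed by an L-tagged phone.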
21 Claims
1. A speech processing system receiving an input related to one of speech and text and process the input to provide an output related to one of speech and text, the speech processing system comprising:
a module derived from a phone set having a plurality of phones for a tonal language, wherein the tonal language comprises a plurality of different tones with different levels of pitch, the phones being used to model syllables used in the module, the syllables having an initial part and a final part, wherein at least some of the syllables of the tonal language include a glide, the glide being embodied in the initial part, and wherein the final part comprises a first temporal portion corresponding to a first relative pitch and a second temporal portion corresponding to a second relative pitch, wherein the first portion and the second portion jointly and implicitly carry tonal information, and wherein the different levels of pitch comprise at least two discrete categorical levels, and wherein each portion has a discrete categorical level associated with it; and
a processor configured to receive an input related to one of speech and text and access the module to process the input to provide an output related to one of speech and text.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A speech processing system receiving an input related to one of speech and text and process the input to perform one of speech recognition and text-to-speech conversion in order to provide an output related to one of speech and text, the speech processing system comprising:
a module derived from a phone set having a plurality of phones for a tonal language comprising a plurality of different tones with different levels of pitch, the phones being used to model syllables used in the module, at least some of the syllables having an initial part and final part, wherein a first set of the plurality of phones are used to describe the glide dependent initial part, and a second set of the plurality of phones are used to describe the final part, wherein the final part comprises a first temporal phone corresponding to a first relative pitch and a second temporal phone corresponding to a second relative pitch, and wherein the different levels of pitch comprise at least two discrete categorical levels, and wherein each phone has a discrete categorical level associated with it; and
a processor configured to receive an input related to one of speech and text and access the module to process the input to provide an output related to one of speech and text.
Dependent claims: 12, 13, 14
15. A computer readable storage media having instructions, which when implemented on a computing device perform speech processing comprising:
accessing a module having a phone set comprising a plurality of phones for a tonal language;
wherein the tonal language comprises a plurality of different tones with different levels of pitch;
the phones being used to model syllables, the syllables having an initial part and final part;
wherein at least some of the syllables of the tonal language include a glide, the glide being embodied in the initial part; and
wherein the final part comprises a first temporal phone corresponding to a first relative pitch and a second temporal phone corresponding to a second relative pitch;
wherein the first and second phones jointly and implicitly carry tonal information; and
wherein the different levels of pitch comprise at least two discrete categorical levels, and wherein each phone has a discrete categorical level associated with it;
utilizing the phone set to identify syllables corresponding to the input for performing one of speech recognition and text-to-speech conversion; and
providing an output corresponding to one of speech recognition and text-to-speech conversion.
Dependent claims: 16, 17, 18, 19, 20, 21
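As a rough illustration of the "utilizing the phone set to identify syllables" step in claim 15, the sketch below reads a three-phone sequence and recovers the tone from the pitch levels of the two final phones. The level-pair table, the `segment_LEVEL` phone naming, and the function name `identify_syllable` are assumptions made for this example; the claims do not prescribe any of them.

```python
# Hypothetical inverse table: a pair of categorical pitch levels, read off
# the two final phones in order, identifies one tone (assumed mapping).
LEVELS_TO_TONE = {
    ("H", "H"): 1,
    ("L", "H"): 2,
    ("L", "L"): 3,
    ("H", "L"): 4,
}


def identify_syllable(phones):
    """Map [initial, first_final_phone, second_final_phone] to
    (initial, final, tone).

    Final phones are assumed to be named "<segment>_<level>", e.g. "a_H".
    """
    initial, first, second = phones
    seg1, level1 = first.rsplit("_", 1)
    seg2, level2 = second.rsplit("_", 1)
    # Neither phone alone determines the tone; only the ordered pair does.
    tone = LEVELS_TO_TONE[(level1, level2)]
    return initial, seg1 + seg2, tone


print(identify_syllable(["zhu", "a_H", "ng_L"]))
```

In a recognizer, the phone sequence would come from acoustic decoding; in text-to-speech, the same table would be used in the opposite direction to pick the phones for a known tone.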
Specification