Segmental tonal modeling for tonal languages

US 7,684,987 B2
Filed: 01/21/2004
Issued: 03/23/2010
Est. Priority Date: 01/21/2004
Status: Expired due to Fees

Abstract

A phone set for use in speech processing such as speech recognition or text-to-speech conversion is used to model or form syllables of a tonal language having a plurality of different tones. Each syllable includes an initial part that can be glide dependent and a final part. The final part includes a plurality of phones. Each phone carries partial tonal information such that the phones taken together implicitly and jointly represent the different tones.
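The decomposition described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Syllable` type, the `to_phones` function, the `segment_LEVEL` naming convention, and the two-level (H/L) table for the four Mandarin lexical tones are all assumptions introduced here. Splitting a single-vowel final into two copies of the vowel is likewise an assumed convention.

```python
from dataclasses import dataclass
from typing import List

# Assumed two-level contours for Mandarin tones 1-4 (illustrative only;
# the abstract does not give this table).
TONE_TO_LEVELS = {1: ("H", "H"), 2: ("L", "H"), 3: ("L", "L"), 4: ("H", "L")}


@dataclass(frozen=True)
class Syllable:
    initial: str   # consonant onset, "" if absent
    glide: str     # medial glide, "" if absent
    nucleus: str   # main vowel of the final
    coda: str      # trailing consonant or off-glide, "" if absent
    tone: int      # lexical tone number, a key of TONE_TO_LEVELS


def to_phones(s: Syllable) -> List[str]:
    """Map a syllable to a glide-dependent initial plus two tonal final phones."""
    first_level, second_level = TONE_TO_LEVELS[s.tone]
    phones = []
    onset = s.initial + s.glide          # the glide is folded into the initial part
    if onset:
        phones.append(onset)
    # The final is split into two temporal portions, each tagged with one level.
    second_segment = s.coda or s.nucleus
    phones.append(f"{s.nucleus}_{first_level}")
    phones.append(f"{second_segment}_{second_level}")
    return phones


print(to_phones(Syllable("g", "u", "a", "ng", 1)))
print(to_phones(Syllable("", "", "a", "", 4)))
```

Neither final phone names the tone on its own. The tone is recoverable only from the pair of levels, which is the sense in which the phones carry it "implicitly and jointly".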

14 Citations

21 Claims

1. A speech processing system receiving an input related to one of speech and text and process the input to provide an output related to one of speech and text, the speech processing system comprising:
- a module derived from a phone set having a plurality of phones for a tonal language, wherein the tonal language comprises a plurality of different tones with different levels of pitch, the phones being used to model syllables used in the module, the syllables having an initial part and a final part, wherein at least some of the syllables of the tonal language include a glide, the glide being embodied in the initial part, and wherein the final part comprises a first temporal portion corresponding to a first relative pitch and a second temporal portion corresponding to a second relative pitch, wherein the first portion and the second portion jointly and implicitly carry tonal information, and wherein the different levels of pitch comprise at least two discrete categorical levels, and wherein each portion has a discrete categorical level associated with it; and
  
  a processor configured to receive an input related to one of speech and text and access the module to process the input to provide an output related to one of speech and text.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
  - 2. The speech processing system of claim 1 wherein the different levels of pitch comprise three categorical levels, and wherein each portion has a discrete categorical level associated with it.
  - 3. The speech processing system of claim 1 wherein the different levels of pitch comprise five categorical levels, and wherein each portion has a discrete categorical level associated with it.
  - 4. The speech processing system of claim 1 wherein the speech processing system comprises one of a speech recognition system and a text-to-speech converter.
  - 5. The speech processing system of claim 4 wherein the different levels of pitch comprise two categorical levels, and wherein each portion has a discrete categorical level associated with it.
  - 6. The speech processing system of claim 4 wherein the different levels of pitch comprise three categorical levels, and wherein each portion has a discrete categorical level associated with it.
  - 7. The speech processing system of claim 4 wherein the different levels of pitch comprise five categorical levels, and wherein each portion has a discrete categorical level associated with it.
  - 8. The speech processing system of claim 1 wherein the tonal language comprises Chinese or a dialect thereof, such as Cantonese.
  - 9. The speech processing system of claim 1 wherein the tonal language comprises Thai or a tonal dialect thereof.
  - 10. The speech processing system of claim 1 wherein the tonal language comprises Vietnamese or a tonal dialect thereof.
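
Claims 1 through 7 recite two, three, or five discrete categorical pitch levels, with one level assigned to each of the two temporal portions of the final. A short sketch of how many two-portion contours each choice permits is given below. The level labels (`L`/`M`/`H`, and digits 1 to 5 for the five-level case) and the function name `two_portion_contours` are assumptions made for illustration, and the claims do not say that every combination is used by a given language.

```python
from itertools import product

# Assumed label sets for each number of categorical levels (lowest first).
LEVEL_NAMES = {
    2: ["L", "H"],
    3: ["L", "M", "H"],
    5: ["1", "2", "3", "4", "5"],
}


def two_portion_contours(n_levels):
    """Return every (first portion, second portion) level pair for n_levels."""
    names = LEVEL_NAMES[n_levels]
    return list(product(names, repeat=2))


for n in (2, 3, 5):
    print(n, "levels:", len(two_portion_contours(n)), "possible contours")
```

With n levels there are n × n ordered pairs, so the upper bound on distinct contours grows from 4 to 9 to 25 as the inventory moves from two to three to five levels.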

11. A speech processing system receiving an input related to one of speech and text and process the input to perform one of speech recognition and text-to-speech conversion in order to provide an output related to one of speech and text, the speech processing system comprising:
- a module derived from a phone set having a plurality of phones for a tonal language comprising a plurality of different tones with different levels of pitch, the phones being used to model syllables used in the module, at least some of the syllables having an initial part and final part, wherein a first set of the plurality of phones are used to describe the glide dependent initial part, and a second set of the plurality of phones are used to describe the final part, wherein the final part comprises a first temporal phone corresponding to a first relative pitch and a second temporal phone corresponding to a second relative pitch, and wherein the different levels of pitch comprise at least two discrete categorical levels, and wherein each phone has a discrete categorical level associated with it; and
  
  a processor configured to receive an input related to one of speech and text and access the module to process the input to provide an output related to one of speech and text.
- Dependent claims (12, 13, 14)
  - 12. The speech processing system of claim 11 wherein the different levels of pitch comprise three categorical levels, and wherein each phone has a discrete categorical level associated with it.
  - 13. The speech processing system of claim 11 wherein the different levels of pitch comprise five categorical levels, and wherein each phone has a discrete categorical level associated with it.
  - 14. The speech processing system of claim 11 wherein at least one syllable comprises only the final part having two phones carrying partial tonal information each.

15. A computer readable storage media having instructions, which when implemented on a computing device perform speech processing comprising:
- accessing a module having a phone set comprising a plurality of phones for a tonal language;
  
  wherein the tonal language comprises a plurality of different tones with different levels of pitch;
  
  the phones being used to model syllables, the syllables having an initial part and final part;
  
  wherein at least some of the syllables of the tonal language include a glide, the glide being embodied in the initial part; and
  
  wherein the final part comprises a first temporal phone corresponding to a first relative pitch and a second temporal phone corresponding to a second relative pitch;
  
  wherein the first and second phones jointly and implicitly carry tonal information; and
  
  wherein the different levels of pitch comprise at least two discrete categorical levels, and wherein each phone has a discrete categorical level associated with it;
  
  utilizing the phone set to identify syllables corresponding to the input for performing one of speech recognition and text-to-speech conversion; and
  
  providing an output corresponding to one of speech recognition and text-to-speech conversion.
- Dependent claims (16, 17, 18, 19, 20, 21)
  - 16. The computer readable storage media of claim 15 wherein the different levels of pitch comprise three categorical levels, and wherein each portion has a discrete categorical level associated with it.
  - 17. The computer readable storage media of claim 15 wherein the different levels of pitch comprise five categorical levels, and wherein each portion has a discrete categorical level associated with it.
  - 18. The computer readable storage media of claim 15 wherein the speech processing system comprises one of a speech recognition system and a text-to-speech converter.
  - 19. The computer readable storage media of claim 18 wherein the different levels of pitch comprise two categorical levels, and wherein each portion has a discrete categorical level associated with it.
  - 20. The computer readable storage media of claim 18 wherein the different levels of pitch comprise three categorical levels, and wherein each portion has a discrete categorical level associated with it.
  - 21. The computer readable storage media of claim 18 wherein the different levels of pitch comprise five categorical levels, and wherein each portion has a discrete categorical level associated with it.
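
Claim 15 recites using the phone set to identify syllables from an input. On the recognition side, that step might look like the toy decoder below, which recovers the onset, the final, and the tone from a recognized phone sequence. The `segment_LEVEL` phone naming, the `LEVELS_TO_TONE` table (an assumed two-level mapping for Mandarin tones 1 to 4), and the function name `decode` are hypothetical and are not taken from the patent.

```python
# Assumed inverse table: (first level, second level) -> Mandarin tone number.
LEVELS_TO_TONE = {("H", "H"): 1, ("L", "H"): 2, ("L", "L"): 3, ("H", "L"): 4}


def decode(phones):
    """Recover (onset, final, tone) from an optional onset plus two tonal phones."""
    if len(phones) not in (2, 3):
        raise ValueError("expected an optional initial followed by two final phones")
    onset = phones[0] if len(phones) == 3 else ""
    # Each final phone is written as "<segment>_<level>", e.g. "a_H".
    (seg1, lvl1), (seg2, lvl2) = (p.rsplit("_", 1) for p in phones[-2:])
    # A single-vowel final is assumed to appear as two copies of the same segment.
    final = seg1 if seg1 == seg2 else seg1 + seg2
    return onset, final, LEVELS_TO_TONE[(lvl1, lvl2)]


print(decode(["gu", "a_H", "ng_H"]))
print(decode(["a_H", "a_L"]))
```

The tone is read off the pair of levels rather than from a dedicated tone symbol, so the decoder needs no separate tone model in this sketch.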

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Huang, Chao; Chu, Min
Primary Examiner(s): Hudspeth, David R
Assistant Examiner(s): Shah, Paras D

Application Number: US10/762,060
Publication Number: US 20050159954A1
Time in Patent Office: 2,253 Days
Field of Search: 704/251, 704/254, 704/231
US Class Current: 704/254
CPC Class Codes:
G10L 15/02   Feature extraction for spee...
G10L 2015/027   Syllables being the recogni...
G10L 25/15   the extracted parameters be...
