High quality concatenative reading system

US 5,878,393 A
Filed: 09/09/1996
Issued: 03/02/1999
Est. Priority Date: 09/09/1996
Status: Expired due to Fees

Abstract

Computer-stored text, such as numerical information, is processed by a word list generator to develop a word list corresponding to those words that are to be spoken by the system. The word list generator assigns a prosodic environment state or token to each entry in the list. The prosodic environment identifies how the word functions in its current prosodic context. Different intonations are applied based on the prosodic environment. Next, the preceding and adjacent words are examined to determine how each word may need to be pronounced differently, based on the ending phoneme of the preceding word and the beginning phoneme of the following word. Using this phonological information along with the prosodic information, a sample list is generated by accessing a dictionary of stored samples. The sample list is then serially played through suitable digital-to-analog conversion circuitry to generate the text-to-speech output. The result is a natural, human-like reading, complete with appropriate intonation changes suitable to the context of the text material.
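
The flow described in the abstract can be sketched end to end as follows. This is a minimal illustration, not the patented implementation: the prosodic labels `RISING` and `FALLING`, the rule that only the last word falls, and the toy `SAMPLES` dictionary (short integer lists standing in for recorded audio) are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical prosodic environment tokens; the text does not enumerate them.
RISING, FALLING = "rising", "falling"

# Toy dictionary: (word, prosodic environment) -> stored sample.
# Real entries would be recorded whole-word audio; short integer lists stand in here.
SAMPLES: Dict[Tuple[str, str], List[int]] = {
    ("two", RISING): [1, 2],
    ("two", FALLING): [2, 1],
    ("five", RISING): [3, 4],
    ("five", FALLING): [4, 3],
}


def build_word_list(words: List[str]) -> List[Tuple[str, str]]:
    """Pair each word with a prosodic token (assumed rule: only the last word falls)."""
    last = len(words) - 1
    return [(w, FALLING if i == last else RISING) for i, w in enumerate(words)]


def build_sample_list(word_list: List[Tuple[str, str]]) -> List[List[int]]:
    """Look up one stored whole-word sample per (word, prosody) pair."""
    return [SAMPLES[pair] for pair in word_list]


def concatenate(sample_list: List[List[int]]) -> List[int]:
    """Join the samples serially, as they would be fed to a D/A converter."""
    pcm: List[int] = []
    for sample in sample_list:
        pcm.extend(sample)
    return pcm


pcm = concatenate(build_sample_list(build_word_list(["two", "five", "two"])))
```

The same word ("two") yields different audio depending on its position, which is the point of storing one sample per intonation type.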

199 Citations

12 Claims

1. A high quality concatenative reading system for converting an input string into a sequence for audible synthesis, comprising:
- a dictionary of complete word speech samples corresponding to entire words stored in a computer-readable medium;
  
  a word list generator receptive of said input string for building and storing word list tokens in a word list, the word list generator building said word list from words stored in said dictionary that correspond to the input string;
  
  said word list generator further having a list of prosodic environment tokens representing a plurality of intonation types, said word list generator assigning at least one of said prosodic environment tokens to at least some of the word list tokens;
  
  phonological feature analyzer that analyzes said word list tokens and said assigned prosodic environment tokens and selects said complete word speech samples from said dictionary to build a sample list based on (a) the word list tokens, (b) the prosodic environment tokens and (c) the phonological features of adjacent words; and
  
  output for concatenatively supplying said sample list to an analog conversion unit to produce an audible text-to-speech signal.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The reading system of claim 1 wherein the word list generator is further operable to add numeric placeholder words corresponding to integers in said input string.
  - 3. The reading system of claim 1 wherein said set of speech samples includes a speech sample entry for each of said plurality of intonation types.
  - 4. The reading system of claim 1 wherein said word list generator builds said word list as ordered pairs, each pair comprising a word token and a prosodic environment token.
  - 5. The reading system of claim 1 wherein said phonological feature analyzer examines at least the word preceding an entry in the word list to determine the phonological features of adjacent words.
  - 6. The reading system of claim 1 wherein said phonological feature analyzer examines at least the word following an entry in the word list to determine the phonological features of adjacent words.
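
Claims 2 and 4 can be illustrated with a small word list generator. The sketch below reads "numeric placeholder words" as place-value words such as "hundred", which is an assumption; the claims do not define the term. The token names `MEDIAL` and `FINAL`, the 0 to 999 range, and the helper names are likewise hypothetical.

```python
from typing import List, Tuple

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def integer_to_words(n: int) -> List[str]:
    """Expand 0..999 into words, inserting the place-value word 'hundred'."""
    if not 0 <= n < 1000:
        raise ValueError("this sketch only handles 0..999")
    words: List[str] = []
    if n >= 100:
        words += [ONES[n // 100], "hundred"]
        n %= 100
        if n == 0:
            return words
    if n < 20:
        words.append(ONES[n])
    else:
        words.append(TENS[n // 10])
        if n % 10:
            words.append(ONES[n % 10])
    return words


def build_word_list(text: str) -> List[Tuple[str, str]]:
    """Return ordered (word token, prosodic environment token) pairs."""
    words: List[str] = []
    for tok in text.split():
        if tok.isascii() and tok.isdigit():
            words.extend(integer_to_words(int(tok)))
        else:
            words.append(tok.lower())
    last = len(words) - 1
    # Assumed rule: the final word gets FINAL intonation, all others MEDIAL.
    return [(w, "FINAL" if i == last else "MEDIAL") for i, w in enumerate(words)]


pairs = build_word_list("room 342")
```

Keeping the list as ordered pairs means later stages can select a sample from the word and its intonation type together, without consulting the input string again.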

7. A method of text-to-speech conversion, comprising:
- receiving an input string representing text to be converted into audible synthesized speech;
  
  constructing a word list of word tokens corresponding to the input string by accessing a dictionary of complete word speech samples corresponding to entire words stored in a computer-readable medium;
  
  supplementing said word list with prosodic environment tokens that represent different intonation types, such that at least some of the word tokens in said word list are associated with a corresponding prosodic environment token;
  
  analyzing the phonological attributes associated with the word tokens in said word list by examining the phonological features of adjacent words in said list;
  
  selecting complete word speech samples from said predetermined dictionary of complete word speech samples corresponding to entire words based on (a) said word list tokens, (b) said corresponding prosodic environment tokens, and (c) said phonological attributes; and
  
  building a sample list of said selected complete word speech samples and supplying said sample list for concatenative output to an analog conversion unit to produce an audible text-to-speech signal.
- Dependent claims (8, 9, 10, 11, 12):
  - 8. The method of claim 7 wherein the step of constructing said word list includes adding numeric placeholder words corresponding to integers in said input string.
  - 9. The method of claim 7 wherein said set of speech samples includes a speech sample entry for each of said different intonation types.
  - 10. The method of claim 7 wherein said step of building a word list comprises building said word list as ordered pairs, where each pair comprises a word token and a prosodic environment token.
  - 11. The method of claim 7 wherein said step of analyzing the phonological attributes comprises examining at least the word preceding an entry in the word list to determine the attribute based on phonological features of the preceding word.
  - 12. The method of claim 7 wherein said step of analyzing the phonological attributes comprises examining at least the word following an entry in the word list to determine the attribute based on phonological features of the following word.
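
The analysis and selection steps of claims 7, 11 and 12 might look like the sketch below. It assumes the phonological attribute of a word is the pair (ending phoneme of the preceding word, beginning phoneme of the following word), as the abstract suggests. The `EDGE_PHONEMES` table, the file names, and the fallback to a context-free sample are hypothetical choices for illustration.

```python
from typing import Dict, List, Optional, Tuple

Context = Tuple[Optional[str], Optional[str]]

# Hypothetical table: word -> (first phoneme, last phoneme).
EDGE_PHONEMES: Dict[str, Tuple[str, str]] = {
    "six": ("s", "s"),
    "seven": ("s", "n"),
    "eight": ("ey", "t"),
}


def phonological_attributes(words: List[str]) -> List[Context]:
    """For each word: (last phoneme of preceding word, first phoneme of following word)."""
    attrs: List[Context] = []
    for i in range(len(words)):
        prev_end = EDGE_PHONEMES[words[i - 1]][1] if i > 0 else None
        next_start = EDGE_PHONEMES[words[i + 1]][0] if i + 1 < len(words) else None
        attrs.append((prev_end, next_start))
    return attrs


def select_sample(dictionary: Dict[tuple, str], word: str,
                  prosody: str, context: Context) -> str:
    """Prefer a sample stored for this exact context, else a context-free one."""
    key = (word, prosody, context)
    if key in dictionary:
        return dictionary[key]
    return dictionary[(word, prosody, None)]


# Toy dictionary with one context-specific variant of "seven" (names are made up).
DICTIONARY = {
    ("seven", "MEDIAL", ("s", "ey")): "seven_after_s_before_ey.wav",
    ("seven", "MEDIAL", None): "seven_plain.wav",
}

attrs = phonological_attributes(["six", "seven", "eight"])
chosen = select_sample(DICTIONARY, "seven", "MEDIAL", attrs[1])
```

Looking both backward and forward lets a single dictionary hold several pronunciations of one word, keyed by what surrounds it in the word list.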

Current Assignee: Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Original Assignee: Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Inventors: Kibre, Nicholas; Hata, Kazue
Primary Examiner(s): Dorvil, Richemond
Application Number: US08/709,581
Time in Patent Office: 904 Days
Field of Search: 704/260, 704/258, 704/267, 704/257, 704/266, 704/275, 704/270
US Class Current: 704/260
CPC Class Codes:
G10L 13/04 Details of speech synthesis...
G10L 13/07 Concatenation rules
