Method and system for customizing voice translation of text to speech

US 7,483,832 B2
Filed: 12/10/2001
Issued: 01/27/2009
Est. Priority Date: 12/10/2001
Status: Expired due to Fees

Abstract

A method and system of customizing voice translation of a text to speech includes digitally recording speech samples of a known speaker, correlating each of the speech samples with a standardized audio representation, and organizing the recorded speech samples and correlated audio representations into a collection. The collection of speech samples correlated with audio representations is saved as a single voice file and stored in a device capable of translating the text to speech. The voice file is applied to a translation of text to speech so that the translated speech is customized according to the applied voice file.
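
The collection described in the abstract can be pictured as a small keyed container. The following is a minimal sketch only: the `VoiceFile` class, its field names, and the JSON-with-hex serialization are illustrative assumptions and are not taken from the patent, which does not specify a storage format here.

```python
import json
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VoiceFile:
    """Hypothetical container: one speaker's recorded samples, each keyed by
    the standardized audio representation it was correlated with."""
    speaker: str
    samples: Dict[str, bytes] = field(default_factory=dict)

    def add_sample(self, representation: str, audio: bytes) -> None:
        # Correlate one digitally recorded sample with its representation.
        self.samples[representation] = audio

    def to_json(self) -> str:
        # Save the whole collection as a single file-like string.
        # Hex encoding is an assumption chosen only to keep JSON valid.
        payload = {
            "speaker": self.speaker,
            "samples": {k: v.hex() for k, v in self.samples.items()},
        }
        return json.dumps(payload, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "VoiceFile":
        # Load a previously saved collection back into memory.
        payload = json.loads(text)
        samples = {k: bytes.fromhex(v) for k, v in payload["samples"].items()}
        return cls(payload["speaker"], samples)
```

A device that translates text to speech would load one such file and look up samples by representation; the placeholder byte strings stand in for real recorded audio.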

21 Claims

1. A method, comprising:
- receiving text content for translation to speech;
  
  correlating the text content to textual phrases of multiple words;
  
  converting each textual phrase into a corresponding string of phonemes;
  
  retrieving a phoneme identifier that uniquely represents each phoneme in the string of phonemes;
  
  concatenating each phoneme identifier of each phoneme in the string of phonemes to produce a sequence of phoneme identifiers with each phoneme identifier separated by a comma;
  
  creating a corresponding sequence of phoneme identifiers for each string of phonemes that corresponds to each textual phrase in the text content;
  
  concatenating each sequence of phoneme identifiers and separating each sequence of phone identifiers by a semi-colon;
  
  accessing a voice file storing recorded phrases in a speaker's voice;
  
  mapping each sequence of phoneme identifiers to a corresponding recorded phrase found in the speaker's voice file;
  
  retrieving the recorded phrase from the voice file that corresponds to each sequence of phoneme identifiers from the text content;
  
  concatenating together the recorded phrases from the speaker's voice file to form a sequence of the recorded phrases as a speech translation of the text content; and
  
  outputting the speech translation as a translation of the text content to speech.
- Dependent claims (2, 3, 4, 5, 6, 7, 8):
  - 2. The method of claim 1, wherein the phoneme identifier uniquely represents a phone.
  - 3. The method of claim 1, wherein the phoneme identifier uniquely represents a biphone.
  - 4. The method of claim 1, wherein the phoneme identifier uniquely represents a triphone.
  - 5. The method of claim 1, wherein the text content comprises content received from a computer network.
  - 6. The method of claim 5, wherein the text content received from the computer network comprises an electronic mail message.
  - 7. The method of claim 1, wherein the text content comprises text received from a telecommunications system.
  - 8. The method of claim 1, further comprising selecting voice files when translating the text content to speech, wherein the translated speech is customized according to a selected voice file.
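
The steps of claim 1 can be sketched end to end as follows. This is a toy illustration under stated assumptions, not the patented implementation: the phoneme identifier values, the two-entry phrase lexicon, the greedy longest-match phrase segmentation, and the byte-string "recordings" are all hypothetical. Only the comma and semi-colon separators come from the claim text.

```python
from typing import Dict, List

# Hypothetical toy tables; the patent does not specify these values.
PHONEME_IDS: Dict[str, int] = {
    "G": 1, "UH": 2, "D": 3, "M": 4, "AO": 5, "R": 6, "N": 7,
    "IH": 8, "NG": 9, "HH": 10, "AW": 11, "AA": 12, "Y": 13, "UW": 14,
}
PHRASE_PHONEMES: Dict[str, List[str]] = {
    "good morning": ["G", "UH", "D", "M", "AO", "R", "N", "IH", "NG"],
    "how are you": ["HH", "AW", "AA", "R", "Y", "UW"],
}


def split_into_phrases(text: str) -> List[str]:
    """Correlate text to known multi-word phrases (greedy longest match,
    an assumed strategy)."""
    words = text.lower().split()
    phrases: List[str] = []
    i = 0
    while i < len(words):
        for j in range(len(words), i, -1):
            candidate = " ".join(words[i:j])
            if candidate in PHRASE_PHONEMES:
                phrases.append(candidate)
                i = j
                break
        else:
            raise KeyError(f"no known phrase starts at word {i}: {words[i]!r}")
    return phrases


def encode_phrase(phrase: str) -> str:
    """Phoneme string -> comma-separated phoneme identifiers."""
    return ",".join(str(PHONEME_IDS[p]) for p in PHRASE_PHONEMES[phrase])


def encode_text(text: str) -> str:
    """Join the per-phrase identifier sequences with semi-colons."""
    return ";".join(encode_phrase(p) for p in split_into_phrases(text))


def synthesize(text: str, voice_file: Dict[str, bytes]) -> bytes:
    """Map each sequence to a recorded phrase and concatenate the recordings."""
    sequences = [s for s in encode_text(text).split(";") if s]
    return b"".join(voice_file[seq] for seq in sequences)
```

With these toy tables, `encode_text("Good morning how are you")` yields `"1,2,3,4,5,6,7,8,9;10,11,12,6,13,14"`, and `synthesize` returns the two stored recordings joined in order when the voice file is keyed by those two sequences.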

9. A text-to-speech translation voice customization system, comprising:
- means for receiving text content for translation to speech;
  
  means for correlating the text content to textual phrases of multiple words;
  
  means for converting each textual phrase into a corresponding string of phonemes;
  
  means for retrieving a phoneme identifier that uniquely represents each phoneme in the string of phonemes;
  
  means for concatenating each phoneme identifier of each phoneme in the string of phonemes to produce a sequence of phoneme identifiers with each phoneme identifier separated by a comma;
  
  means for creating a corresponding sequence of phoneme identifiers for each string of phonemes that corresponds to each textual phrase in the text content;
  
  means for concatenating each sequence of phoneme identifiers and separating each sequence of phone identifiers by a semi-colon;
  
  means for accessing a voice file storing recorded phrases in a speaker's voice;
  
  means for mapping each sequence of phoneme identifiers to a corresponding recorded phrase in the speaker's voice file;
  
  means for retrieving the recorded phrase from the voice file that corresponds to each sequence of phoneme identifiers;
  
  means for concatenating together the recorded phrases from the speaker's voice file to form a sequence of the recorded phrases as a speech translation of the text content; and
  
  means for outputting the speech translation as a translation of the text content to speech.
- Dependent claims (10, 11, 12, 13, 14, 15, 16, 17, 18, 19):
  - 10. The system of claim 9, wherein the recorded phrases comprise digitally recorded speech samples.
  - 11. The system of claim 9, wherein the recorded phrases comprise analog voice signals that are converted to digital samples and represent at least one of speech speed, emphasis, rhythm, pitch, pausing, and emotion of the speaker.
  - 12. The system of claim 9, further comprising means for accessing a subset of the voice file sufficient to cause the textual sequence to be translated to speech using the associated voice file.
  - 13. The system of claim 9, further comprising means for classifying the string of phonemes to standardized numbers.
  - 14. The system of claim 13, wherein a standardized number uniquely represents at least one of a phone, a phoneme, a biphone, and a triphone.
  - 15. The system of claim 9, further comprising means for applying a combination of different voice files to create a new voice file.
  - 16. The system of claim 9, further comprising means for receiving the text content as content from a computer network.
  - 17. The system of claim 16, wherein the text content comprises an electronic mail message.
  - 18. The system of claim 9, further comprising means for receiving the text content as text from a telecommunications system.
  - 19. The system of claim 9, further comprising means for selecting voice files when translating the text content to speech, wherein the translated speech is customized according to a selected voice file.
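
Claims 12, 15, and 19 refer to accessing a subset of a voice file, combining voice files into a new one, and selecting among voice files. The claims do not say how any of these are carried out, so the sketch below is only one possible reading: a voice file is assumed to be a mapping from an identifier sequence to recorded audio, the "earlier file wins" merge rule is an assumption, and all function names are hypothetical.

```python
from typing import Dict, Iterable

# Assumed shape: identifier sequence (e.g. "1,2,3") -> recorded audio bytes.
VoiceFile = Dict[str, bytes]


def combine_voice_files(*voice_files: VoiceFile) -> VoiceFile:
    """Create a new voice file from several; on a clash the earlier file wins
    (an assumed policy, not stated in the claims)."""
    combined: VoiceFile = {}
    for voice_file in voice_files:
        for sequence, audio in voice_file.items():
            combined.setdefault(sequence, audio)
    return combined


def subset_for(voice_file: VoiceFile, sequences: Iterable[str]) -> VoiceFile:
    """Keep only the entries needed to speak the given identifier sequences."""
    return {sequence: voice_file[sequence] for sequence in sequences}


def select_voice_file(library: Dict[str, VoiceFile], name: str) -> VoiceFile:
    """Pick the voice file to apply, by a hypothetical speaker name."""
    return library[name]
```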

20. A storage medium on which is encoded instructions for performing a method of translating text to speech, the method comprising:
- receiving text content for translation to speech;
  
  correlating the text content to textual phrases of multiple words;
  
  converting each textual phrase into a corresponding string of phonemes;
  
  retrieving a phoneme identifier that uniquely represents each phoneme in the string of phonemes;
  
  concatenating each phoneme identifier of each phoneme in the string of phonemes to produce a sequence of phoneme identifiers with each phoneme identifier separated by a comma;
  
  creating a corresponding sequence of phoneme identifiers for each string of phonemes that corresponds to each textual phrase in the text content;
  
  concatenating each sequence of phoneme identifiers and separating each sequence of phone identifiers by a semi-colon;
  
  accessing a voice file storing recorded phrases in a speaker's voice;
  
  mapping each sequence of phoneme identifiers to a corresponding recorded phrase in the speaker's voice file;
  
  retrieving the recorded phrase from the voice file that corresponds to each sequence of phoneme identifiers;
  
  concatenating together the recorded phrases from the speaker's voice file to form a sequence of the recorded phrases as a speech translation of the text content; and
  
  outputting the speech translation as a translation of the text content to speech.
- Dependent claim (21):
  - 21. The storage medium of claim 20, further comprising instructions for selecting voice files, such that the text content is translated using a selected voice file.

Current Assignee: Cerence Operating Company (Cerence Inc.)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Tischer, Steve
Primary Examiner(s): Hudspeth, David R
Assistant Examiner(s): Sked, Matthew J
Application Number: US10/012,946
Publication Number: US 20040111271A1
Time in Patent Office: 2,605 Days
Field of Search: 704/261, 704/266, 704/260, 704/258
US Class Current: 704/260
CPC Class Codes: G10L 13/033 Voice editing, e.g. manipul...
