Methods, systems, and products for translating text to speech

US 20060069567A1
Filed: 11/05/2005
Published: 03/30/2006
Est. Priority Date: 12/10/2001
Status: Abandoned Application

Abstract

Methods, systems, and products are disclosed for translating text to speech. One such method receives content for translation to speech, identifies a textual sequence in the content, and correlates the textual sequence to a phrase. A voice file storing multiple phrases is accessed, with the voice file mapping each phrase to a corresponding sequential string of phonemes. The sequential string of phonemes, corresponding to the phrase, is retrieved and processed when translating the textual sequence to speech.
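The flow described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory `VOICE_FILE` dictionary, the ARPAbet-like phoneme symbols, the punctuation-based splitting rule, and the function names are all hypothetical and do not come from the application.

```python
import re

# Hypothetical in-memory "voice file": each stored phrase maps to its
# sequential string of phonemes. Symbols are illustrative placeholders.
VOICE_FILE = {
    "good morning": ["G", "UH", "D", "M", "AO", "R", "N", "IH", "NG"],
    "thank you": ["TH", "AE", "NG", "K", "Y", "UW"],
}


def identify_textual_sequences(content):
    """Split content at punctuation and normalize case and whitespace.

    Assumption: a textual sequence is a run of words between punctuation marks.
    """
    parts = re.split(r"[.,;:!?]+", content.lower())
    return [" ".join(part.split()) for part in parts if part.strip()]


def translate(content, voice_file):
    """Return (phrase, phonemes) pairs for sequences found in the voice file."""
    output = []
    for sequence in identify_textual_sequences(content):
        # Correlate the sequence to a stored phrase, then retrieve its phonemes.
        phonemes = voice_file.get(sequence)
        if phonemes is not None:
            output.append((sequence, phonemes))
    return output
```

Calling `translate("Good morning, thank you!", VOICE_FILE)` yields one pair per matched phrase. Sequences with no matching phrase are simply skipped in this sketch; a real engine would presumably fall back to some other synthesis path.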

294 Citations

20 Claims

1. A method of translating text to speech, comprising:
- receiving content for translation to speech;
- identifying a textual sequence in the content;
- correlating the textual sequence to a phrase;
- accessing a voice file storing multiple phrases, the voice file mapping each phrase to a corresponding sequential string of phonemes stored in the voice file;
- retrieving the sequential string of phonemes corresponding to the phrase; and
- processing the sequential string of phonemes when translating the textual sequence to speech.

2. A method according to claim 1, further comprising receiving a tag that uniquely identifies the voice file of a speaker, such that the textual sequence is translated to speech using the speaker's voice.

3. A method according to claim 1, further comprising correlating combined phrases to the textual sequence, such that at least two sequential strings of phonemes are combined and processed when translating the textual sequence to speech.

4. A method according to claim 1, further comprising combining at least two sequential strings of phonemes from different voice files of different speakers, with the at least two sequential strings of phonemes mapping to the same phrase, such that the textual sequence is translated into speech having attributes of each speaker's voice.

5. A method according to claim 1, wherein the step of accessing the voice file comprises accessing a mean characteristic voice file and accessing a speaker's delta voice file, the mean characteristic voice file containing common voice characteristics that are common to a population of speakers, and the speaker's delta voice file containing unique voice characteristics that are unique to that speaker.

6. A method according to claim 1, further comprising:
- comparing a speaker's unique voice characteristics stored in the voice file to actual speech to authenticate an identity of a sender of the content; and
- if the actual speech is unlike the unique voice characteristics stored in the voice file, then filtering the content.

7. A method according to claim 1, wherein the step of receiving the content comprises receiving the voice file that accompanies the content, the voice file comprising only those phonemes needed to translate the content to speech.
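The split between a mean characteristic voice file and a speaker's delta voice file (claim 5) can be pictured as a shared baseline plus per-speaker offsets. The sketch below is an assumption-laden illustration: the claims do not say how the two files are combined, so the additive model, the characteristic names (`pitch_hz`, `rate_wpm`), and the function `apply_delta` are hypothetical.

```python
def apply_delta(mean_voice, speaker_delta):
    """Combine population-wide characteristics with one speaker's offsets.

    Assumption: characteristics are numeric and the delta is additive.
    Returns a new dict and leaves both inputs unchanged.
    """
    combined = dict(mean_voice)
    for name, offset in speaker_delta.items():
        combined[name] = combined.get(name, 0.0) + offset
    return combined


# Hypothetical values: one baseline shared by all speakers, and a small
# delta file that stores only how this speaker differs from it.
MEAN_VOICE = {"pitch_hz": 160.0, "rate_wpm": 150.0}
SPEAKER_DELTA = {"pitch_hz": 45.0, "rate_wpm": -10.0}

speaker_voice = apply_delta(MEAN_VOICE, SPEAKER_DELTA)
```

Under this model the delta file is the only per-speaker data, which is one plausible reason to separate it from the shared baseline.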

8. A system, comprising:
- a text-to-speech translation engine stored in storage; and
- a processor communicating with the storage, the text-to-speech translation application receiving content for translation to speech, identifying a textual sequence in the content, and correlating the textual sequence to a phrase;
- the text-to-speech translation application accessing a voice file storing multiple phrases, the voice file mapping each phrase to a corresponding sequential string of phonemes stored in the voice file;
- the text-to-speech translation application retrieving the sequential string of phonemes corresponding to the phrase and processing the sequential string of phonemes when translating the textual sequence to speech.

9. A system according to claim 8, the text-to-speech translation application further receiving a tag that uniquely identifies the voice file of a speaker, such that the textual sequence is translated to speech using the speaker's voice.

10. A system according to claim 8, the text-to-speech translation application further correlating combined phrases to the textual sequence, such that at least two sequential strings of phonemes are combined and processed when translating the textual sequence to speech.

11. A system according to claim 8, the text-to-speech translation application further combining at least two sequential strings of phonemes from different voice files of different speakers, with the at least two sequential strings of phonemes mapping to the same phrase, such that the textual sequence is translated into speech having attributes of each speaker's voice.

12. A system according to claim 8, wherein when the text-to-speech translation application accesses the voice file, the text-to-speech translation application accesses a mean characteristic voice file and accesses a speaker's delta voice file, the mean characteristic voice file containing common voice characteristics that are common to a population of speakers, and the speaker's delta voice file containing unique voice characteristics that are unique to that speaker.

13. A system according to claim 8, the text-to-speech translation application i) comparing a speaker's unique voice characteristics stored in the voice file to actual speech to authenticate an identity of a sender of the content, and ii) if the actual speech is unlike the unique voice characteristics stored in the voice file, then filtering the content.

14. A system according to claim 8, wherein when the text-to-speech translation application receives the content, the voice file accompanies the content, the voice file comprising only those phonemes needed to translate the content to speech.
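The authentication step in claims 6 and 13 compares stored voice characteristics against actual speech and filters the content when they differ. The claims do not define what "unlike" means, so the following is a minimal sketch assuming numeric characteristics and a relative tolerance; `is_like`, `deliver_or_filter`, and the 10% default are hypothetical choices.

```python
def is_like(stored, observed, tolerance=0.1):
    """Return True when every stored characteristic is matched by the observed speech.

    Assumption: a characteristic matches if it is present in the observed
    measurements and within `tolerance` (relative) of the stored value.
    """
    for name, expected in stored.items():
        actual = observed.get(name)
        if actual is None:
            return False
        if abs(actual - expected) > tolerance * abs(expected):
            return False
    return True


def deliver_or_filter(content, stored, observed):
    """Pass the content through when the sender authenticates, else filter it.

    Filtering is modeled here as returning None.
    """
    return content if is_like(stored, observed) else None
```

For example, with a stored `pitch_hz` of 200.0, an observed value of 205.0 falls inside the tolerance and the content is delivered, while an observed value of 260.0 causes it to be filtered.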

15. A computer program product comprising computer-readable instructions for performing the steps:
- receiving content for translation to speech;
- identifying a textual sequence in the content;
- correlating the textual sequence to a phrase;
- accessing a voice file storing multiple phrases, the voice file mapping each phrase to a corresponding sequential string of phonemes stored in the voice file;
- retrieving the sequential string of phonemes corresponding to the phrase; and
- processing the sequential string of phonemes when translating the textual sequence to speech.

16. A computer program product according to claim 15, further comprising instructions for receiving a tag that uniquely identifies the voice file of a speaker, such that the textual sequence is translated to speech using the speaker's voice.

17. A computer program product according to claim 15, further comprising instructions for correlating combined phrases to the textual sequence, such that at least two sequential strings of phonemes are combined and processed when translating the textual sequence to speech.

18. A computer program product according to claim 15, further comprising instructions for combining at least two sequential strings of phonemes from different voice files of different speakers, with the at least two sequential strings of phonemes mapping to the same phrase, such that the textual sequence is translated into speech having attributes of each speaker's voice.

19. A computer program product according to claim 15, further comprising instructions for:
- accessing a mean characteristic voice file and accessing a speaker's delta voice file, the mean characteristic voice file containing common voice characteristics that are common to a population of speakers, and the speaker's delta voice file containing unique voice characteristics that are unique to that speaker;
- comparing the unique voice characteristics stored in the speaker's delta voice file to actual speech to authenticate an identity of a sender of the content; and
- if the actual speech is unlike the unique voice characteristics stored in the speaker's delta voice file, then filtering the content.

20. A computer program product according to claim 15, wherein the instructions for receiving the content comprise instructions receiving the voice file that accompanies the content, the voice file comprising only those phonemes needed to translate the content to speech.
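Claims 7, 14, and 20 describe a voice file that travels with the content and holds only the phonemes needed to speak it. One way to picture this is as a subset extracted from a larger voice file. The layout below (a `phrases` map plus a `units` map from phoneme symbol to stored audio data) and the function name are assumptions made for illustration, not a format defined in the application.

```python
def build_accompanying_voice_file(needed_phrases, full_voice_file):
    """Extract only the phrases and phoneme units required for the content.

    Assumption: the full voice file is a dict with a "phrases" map
    (phrase -> list of phoneme symbols) and a "units" map
    (phoneme symbol -> stored audio data for that phoneme).
    """
    phrases = {
        phrase: full_voice_file["phrases"][phrase]
        for phrase in needed_phrases
        if phrase in full_voice_file["phrases"]
    }
    # Collect every phoneme symbol that the selected phrases actually use.
    needed_symbols = {symbol for seq in phrases.values() for symbol in seq}
    units = {symbol: full_voice_file["units"][symbol] for symbol in sorted(needed_symbols)}
    return {"phrases": phrases, "units": units}


# Hypothetical full voice file; the byte strings stand in for audio data.
FULL_VOICE_FILE = {
    "phrases": {
        "hi": ["HH", "AY"],
        "bye": ["B", "AY"],
    },
    "units": {"HH": b"\x01", "AY": b"\x02", "B": b"\x03"},
}

subset = build_accompanying_voice_file(["hi"], FULL_VOICE_FILE)
```

Here `subset` carries the units for `HH` and `AY` but not `B`, so the data sent alongside the content is smaller than the speaker's full voice file.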

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Koch, Robert A.; Tischer, Steven N.; Malik, Dale
Application Number: US11/267,092
Publication Number: US 20060069567A1
US Class Current: 704/260
CPC Class Codes: G10L 13/033 Voice editing, e.g. manipul...
