METHOD AND APPARATUS FOR PROVIDING SPEECH OUTPUT FOR SPEECH-ENABLED APPLICATIONS

US 20150106101A1
Filed: 12/16/2014
Published: 04/16/2015
Est. Priority Date: 02/12/2010
Status: Active Grant

Abstract

Techniques for providing speech output for speech-enabled applications. A synthesis system receives from a speech-enabled application a text input including a text transcription of a desired speech output. The synthesis system selects one or more audio recordings corresponding to one or more portions of the text input. In one aspect, the synthesis system selects from audio recordings provided by a developer of the speech-enabled application. In another aspect, the synthesis system selects an audio recording of a speaker speaking a plurality of words. The synthesis system forms a speech output including the one or more selected audio recordings and provides the speech output for the speech-enabled application.
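The selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Recording` type, the `normalize` helper, the greedy longest-match strategy, and the file names are all assumptions introduced here. The source does not say how uncovered text is handled, so the sketch simply marks such spans with `None`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    """A developer-supplied audio file and the words it contains (hypothetical type)."""
    path: str
    transcript: str


def normalize(text):
    """Lowercase and strip punctuation so matching works on plain word tuples."""
    stripped = (w.strip(".,!?;:").lower() for w in text.split())
    return tuple(w for w in stripped if w)


def select_recordings(text_input, recordings):
    """Cover the text input left to right, preferring the longest matching recording.

    Returns a list of (words, recording) pairs; recording is None where
    no supplied recording matches that word.
    """
    index = {normalize(r.transcript): r for r in recordings}
    words = normalize(text_input)
    max_len = max((len(key) for key in index), default=0)
    plan, i = [], 0
    while i < len(words):
        # Try the longest candidate span first, then shrink it.
        for n in range(min(max_len, len(words) - i), 0, -1):
            rec = index.get(words[i:i + n])
            if rec is not None:
                plan.append((words[i:i + n], rec))
                i += n
                break
        else:
            # Nothing matched at this position; mark the word as uncovered.
            plan.append((words[i:i + 1], None))
            i += 1
    return plan


recordings = [
    Recording("flight.wav", "Your flight"),
    Recording("departs.wav", "departs at"),
    Recording("flight_departs.wav", "Your flight departs at"),
]
plan = select_recordings("Your flight departs at noon.", recordings)
for words, rec in plan:
    print(" ".join(words), "->", rec.path if rec else "(no recording)")
```

Preferring the longest match favors a single recording of several words spoken together over a concatenation of shorter clips, which is the case the second aspect of the abstract describes.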

41 Claims

1-30. (canceled)

31. A method for providing a speech output for a speech-enabled application, the method comprising:
- receiving from the speech-enabled application a text input comprising a text transcription of a desired speech output;
- selecting, using at least one computer system, an audio recording of a speaker speaking a plurality of words, the audio recording corresponding to at least a first portion of the text input; and
- providing for the speech-enabled application a speech output comprising the audio recording.
- Dependent Claims (32, 33, 34, 35)
  - 32. The method of claim 31, wherein the audio recording is of the speaker reading at least a portion of a script, the at least a portion of the script corresponding exactly to the plurality of words, the plurality of words corresponding exactly to words of the at least the first portion of the text input.
  - 33. The method of claim 31, wherein the audio recording is stored in a single audio file.
  - 34. The method of claim 31, wherein the plurality of words were spoken consecutively by the speaker when forming the audio recording.
  - 35. The method of claim 31, wherein the audio recording comprises the plurality of words spoken naturally by the speaker.

36. A method for providing a speech output for a speech-enabled application, the method comprising:
- receiving at least one input specifying a desired speech output;
- selecting, using at least one computer system, at least one audio recording corresponding to at least a first portion of the desired speech output, the at least one audio recording being selected based at least in part on at least one constraint regarding a desired contrastive stress pattern in the desired speech output, the at least one constraint being indicated by metadata associated with the at least one audio recording; and
- providing for the speech-enabled application a speech output comprising the at least one audio recording.
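
One way to picture the metadata-driven selection in claim 36 is the sketch below. It is an illustration under stated assumptions, not the claimed method itself: the `stressed_words` field, the `select_by_stress` function, the closest-match scoring, and the file names are hypothetical, and the claim does not specify any metadata format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    path: str
    transcript: str
    # Hypothetical metadata: the words spoken with contrastive stress in this take.
    stressed_words: frozenset = frozenset()


def select_by_stress(portion, desired_stress, recordings):
    """Pick a recording of `portion` whose stress metadata best fits the desired pattern.

    Returns None when no recording has a matching transcript.
    """
    wanted = frozenset(w.lower() for w in desired_stress)
    candidates = [r for r in recordings if r.transcript.lower() == portion.lower()]
    if not candidates:
        return None
    # Score each take by how many words differ between wanted and recorded stress.
    return min(candidates, key=lambda r: len(wanted ^ r.stressed_words))


recordings = [
    Recording("neutral.wav", "the red line"),
    Recording("red.wav", "the red line", frozenset({"red"})),
    Recording("line.wav", "the red line", frozenset({"line"})),
]
chosen = select_by_stress("The red line", ["red"], recordings)
print(chosen.path)
```

Several takes of the same words can coexist here, and the constraint on stress decides which one is used; a stricter variant could reject any take whose metadata does not match exactly.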

37. At least one non-transitory computer-readable storage medium encoded with a plurality of computer-executable instructions that, when executed, perform a method for providing a speech output for a speech-enabled application, the method comprising:
- receiving from the speech-enabled application a text input comprising a text transcription of a desired speech output;
- selecting an audio recording of a speaker speaking a plurality of words, the audio recording corresponding to at least a first portion of the text input; and
- providing for the speech-enabled application a speech output comprising the audio recording.
- Dependent Claims (38, 39, 40, 41)
  - 38. The at least one non-transitory computer-readable storage medium of claim 37, wherein the audio recording is of the speaker reading at least a portion of a script, the at least a portion of the script corresponding exactly to the plurality of words, the plurality of words corresponding exactly to words of the at least the first portion of the text input.
  - 39. The at least one non-transitory computer-readable storage medium of claim 37, wherein the audio recording is stored in a single audio file.
  - 40. The at least one non-transitory computer-readable storage medium of claim 37, wherein the plurality of words were spoken consecutively by the speaker when forming the audio recording.
  - 41. The at least one non-transitory computer-readable storage medium of claim 37, wherein the audio recording comprises the plurality of words spoken naturally by the speaker.

Current Assignee: Cerence Operating Company (Cerence Inc.)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Meyer, Darren C.; Bos-Plachez, Corinne; Staessen, Martine Marguerite
Granted Patent: US 9,424,833 B2
US Class Current: 704/260
CPC Class Codes:
G10L 13/02   Methods for producing synth...
G10L 13/04   Details of speech synthesis...
G10L 13/08   Text analysis or generation...
