System and method for enriching spoken language translation with prosodic information

US 8,571,849 B2
Filed: 09/30/2008
Issued: 10/29/2013
Est. Priority Date: 09/30/2008
Status: Active Grant

Abstract

Disclosed herein are systems, methods, and computer-readable media for enriching spoken language translation with prosodic information in a statistical speech translation framework. The method includes receiving speech for translation to a target language, generating pitch accent labels representing segments of the received speech which are prosodically prominent, and injecting the pitch accent labels with word tokens within the translation engine to create enriched target language output text. A further step of synthesizing speech in the target language based on the prosody-enriched target language output text may be added. An automatic prosody labeler can generate the pitch accent labels and can exploit lexical, syntactic, and prosodic information of the speech. A maximum entropy model may be used to determine which segments of the speech are prosodically prominent. A pitch accent label can include an indication of certainty that a respective segment of the speech is prosodically prominent and/or an indication of prosodic prominence of a respective segment of speech.
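As an illustration of what "injecting pitch accent labels with word tokens" could look like in practice, the following is a minimal sketch, not the patented implementation. It assumes a `word|LABEL` token format, the hypothetical tag names `ACC` and `NONE`, and that a separate labeler has already produced one accent decision per word; none of these details come from the patent text.

```python
from typing import List, Tuple

ACCENT = "ACC"      # hypothetical tag: word is prosodically prominent
NO_ACCENT = "NONE"  # hypothetical tag: word is not prominent


def inject_labels(words: List[str], accented: List[bool], sep: str = "|") -> List[str]:
    """Attach one pitch accent label to each word token (assumed word|LABEL format)."""
    if len(words) != len(accented):
        raise ValueError("need exactly one accent decision per word")
    return [f"{w}{sep}{ACCENT if a else NO_ACCENT}" for w, a in zip(words, accented)]


def split_labels(tokens: List[str], sep: str = "|") -> List[Tuple[str, str]]:
    """Recover (word, label) pairs, e.g. to pass the labels on to a speech synthesizer."""
    pairs = []
    for token in tokens:
        word, label = token.rsplit(sep, 1)
        pairs.append((word, label))
    return pairs


tokens = inject_labels(
    ["i", "want", "the", "red", "one"],
    [False, True, False, True, False],
)
print(" ".join(tokens))
# i|NONE want|ACC the|NONE red|ACC one|NONE
```

Keeping the label attached to the token lets a translation engine treat the pair as a single unit, and `split_labels` shows how the labels could be separated again for a downstream synthesis step.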

17 Claims

1. A method comprising:
- receiving speech for translation to a target language;
  
  prior to a translation of the speech, generating, via a processor and via a discriminative classifier model, a pitch accent label based on the speech independent of volume, the pitch accent label having a regional accent type and representing segments of the speech which are prosodically prominent; and
  
  injecting the pitch accent label with a word token within a translation engine to create target language output text.
  - 2. The method of claim 1, further comprising synthesizing speech in the target language based on the target language output text.
  - 3. The method of claim 1, wherein an automatic prosody labeler generates the pitch accent label.
  - 4. The method of claim 1, wherein the discriminative classifier model determines which segments of the speech are prosodically prominent.
  - 5. The method of claim 1, wherein the pitch accent label comprises an indication of certainty that a respective segment of the speech is prosodically prominent.
  - 6. The method of claim 1, wherein the pitch accent label comprises an indication of prosodic prominence of a respective segment of speech.

7. A system comprising:
- a processor; and
  
  a computer-readable medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
  
  receiving speech for translation to a target language;
  
  prior to a translation of the speech, generating, via the processor and via a discriminative classifier model, a pitch accent label based on the speech independent of volume, the pitch accent label having a regional accent type and representing segments of the speech which are prosodically prominent; and
  
  injecting the pitch accent label with a word token within a translation engine to create target language output text.
  - 8. The system of claim 7, the computer-readable medium having additional instructions stored which result in the operations further comprising synthesizing speech in the target language based on the target language output text.
  - 9. The system of claim 7, wherein an automatic prosody labeler generates the pitch accent label.
  - 10. The system of claim 7, wherein the discriminative classifier model determines which segments of the speech are prosodically prominent.
  - 11. The system of claim 7, wherein the pitch accent label comprises an indication of certainty that a respective segment of the speech is prosodically prominent.
  - 12. The system of claim 7, wherein the pitch accent label comprises an indication of prosodic prominence of a respective segment of speech.

13. A computer-readable storage device having instructions stored which, when executed by a processor, cause the processor to perform operations comprising:
- receiving speech for translation to a target language;
  
  prior to a translation of the speech, generating, via the processor and via a discriminative classifier model, a pitch accent label based on the speech independent of volume, the pitch accent label having a regional accent type and representing segments of the speech which are prosodically prominent; and
  
  injecting the pitch accent label with a word token within a translation engine to create target language output text.
  - 14. The computer-readable storage device of claim 13, the computer-readable storage device having additional instructions stored which result in the operations further comprising synthesizing speech in the target language based on the enriched target language output text.
  - 15. The computer-readable storage device of claim 13, wherein an automatic prosody labeler generates the pitch accent label.
  - 16. The computer-readable storage device of claim 13, wherein the discriminative classifier model determines which segments of the speech are prosodically prominent.
  - 17. The computer-readable storage device of claim 13, wherein the pitch accent label comprises an indication of certainty that a respective segment of the speech is prosodically prominent.
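
The claims above recite a discriminative classifier model that decides which segments are prosodically prominent, and claims 5, 11, and 17 add an indication of certainty to the label. The abstract names a maximum entropy model as one option. The sketch below is a hedged illustration only: it assumes a binary maximum entropy model (equivalent to logistic regression) trained by batch gradient ascent, two hypothetical per-word features (a content-word flag and a normalized pitch range, with no volume feature), and a tiny made-up training set that exists purely to make the example runnable.

```python
import math
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (features, 1 if accented else 0)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def score(weights: List[float], x: Sequence[float]) -> float:
    """Probability that a word is accented; the last weight is the bias."""
    xb = list(x) + [1.0]
    return sigmoid(sum(w * xi for w, xi in zip(weights, xb)))


def train_maxent(data: List[Example], lr: float = 0.5, epochs: int = 5000) -> List[float]:
    """Fit a binary maximum entropy model by batch gradient ascent on the log-likelihood."""
    weights = [0.0] * (len(data[0][0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in data:
            err = y - score(weights, x)
            for j, xj in enumerate(list(x) + [1.0]):
                grad[j] += err * xj
        weights = [w + lr * g / len(data) for w, g in zip(weights, grad)]
    return weights


def label_word(weights: List[float], x: Sequence[float], threshold: float = 0.5) -> Tuple[str, float]:
    """Return a (label, certainty) pair; the tag names are hypothetical."""
    p = score(weights, x)
    return ("ACC" if p >= threshold else "NONE", p)


# Made-up toy data: (content_word_flag, normalized_pitch_range) -> accented?
TOY_DATA: List[Example] = [
    ((1, 0.9), 1), ((1, 0.7), 1), ((0, 0.8), 1),
    ((0, 0.2), 0), ((0, 0.1), 0), ((1, 0.3), 0),
]

weights = train_maxent(TOY_DATA)
label, certainty = label_word(weights, (1, 0.85))
print(label, round(certainty, 3))
```

Returning the probability alongside the label is one simple way to carry an "indication of certainty" forward; a real labeler would use far richer lexical, syntactic, and prosodic features than the two assumed here.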

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: RANGARAJAN SRIDHAR, Vivek Kumar; BANGALORE, Srinivas
Primary Examiner(s): COLUCCI, MICHAEL C
Application Number: US12/241,660
Publication Number: US 20100082326A1
Time in Patent Office: 1,855 Days
Field of Search: 704/258, 704/251, 704/2, 704/10, 704/200, 704/231, 704/235, 704/252, 704/260, 704/266, 704/270.1, 704/275, 704/277, 704/3, 709/250
US Class Current: 704/3
CPC Class Codes:
G06F 40/58 Use of machine translation,...
G10L 13/10 Prosody rules derived from ...
