SYSTEM AND METHOD FOR ENRICHING SPOKEN LANGUAGE TRANSLATION WITH PROSODIC INFORMATION

US 20100082326A1
Filed: 09/30/2008
Published: 04/01/2010
Est. Priority Date: 09/30/2008
Status: Active Grant



Abstract

Disclosed herein are systems, methods, and computer-readable media for enriching spoken language translation with prosodic information in a statistical speech translation framework. The method includes receiving speech for translation to a target language, generating pitch accent labels representing segments of the received speech which are prosodically prominent, and injecting pitch accent labels with word tokens within a translation engine to create enriched target language output text. The method may further include synthesizing speech in the target language based on the prosody-enriched target language output text. An automatic prosody labeler can generate the pitch accent labels and can exploit lexical, syntactic, and prosodic information of the speech. A maximum entropy model may be used to determine which segments of the speech are prosodically prominent. A pitch accent label can include an indication of certainty that a respective segment of the speech is prosodically prominent and/or an indication of prosodic prominence of a respective segment of speech.
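
As a rough illustration of the injection step described in the abstract, the sketch below pairs each source word token with a pitch accent label and carries that label through a toy word-by-word lookup. The `word|LABEL` token format, the `inject_labels` and `translate` helpers, and the tiny English-to-Spanish table are all assumptions made for illustration; the text here does not specify a token format or a particular translation model.

```python
# Hypothetical sketch: enriched tokens are written as "word|LABEL".
ACCENT, NONE = "ACC", "NONE"


def inject_labels(words, accents):
    """Pair each word token with its pitch accent label."""
    if len(words) != len(accents):
        raise ValueError("one label per word token is required")
    return [f"{w}|{ACCENT if a else NONE}" for w, a in zip(words, accents)]


# Toy word-level table keyed on the bare source word (illustrative only;
# a statistical engine would learn its model from enriched parallel data).
TOY_TABLE = {"i": "yo", "want": "quiero", "coffee": "café"}


def translate(enriched_tokens, table=TOY_TABLE):
    """Translate word by word, carrying each accent label to the target side."""
    out = []
    for token in enriched_tokens:
        word, label = token.rsplit("|", 1)
        target = table.get(word.lower(), word)  # unknown words pass through
        out.append(f"{target}|{label}")
    return out


source = inject_labels(["I", "want", "coffee"], [False, False, True])
target = translate(source)
print(source)
print(target)
```

In this toy setup the accent on "coffee" survives translation as a label on the corresponding target word, which is the kind of enriched output text a downstream synthesizer could use.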

67 Citations


20 Claims

1. A method of enriching spoken language translation with prosodic information in a statistical speech translation framework, the method comprising:
- receiving speech for translation to a target language;
- generating pitch accent labels representing segments of the received speech which are prosodically prominent; and
- injecting pitch accent labels with word tokens within a translation engine to create enriched target language output text.

- 2. The method of claim 1, the method further comprising synthesizing speech in the target language based on the prosody enriched target language output text.
- 3. The method of claim 1, wherein an automatic prosody labeler generates the pitch accent labels.
- 4. The method of claim 3, wherein the automatic prosody labeler exploits lexical, syntactic, and prosodic information of the speech.
- 5. The method of claim 1, wherein a discriminative classifier model (such as a maximum entropy model) is used to determine which segments of the speech are prosodically prominent.
- 6. The method of claim 1, wherein a pitch accent label includes an indication of certainty that a respective segment of the speech is prosodically prominent.
- 7. The method of claim 1, wherein a pitch accent label includes an indication of prosodic prominence of a respective segment of speech.
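
Claims 5 and 6 mention a discriminative classifier and a label that carries a certainty. A minimal sketch of that idea follows, using a binary maximum entropy model (equivalent to logistic regression) trained by plain gradient ascent. The two features, the training data, and the `train` and `label_segment` helpers are hypothetical and chosen only to keep the example small; they are not taken from the patent.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=500, lr=0.5):
    """Fit weights and bias by stochastic gradient ascent on the log-likelihood."""
    w = [0.0] * len(examples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in examples:
            p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            err = y - p  # gradient of the log-likelihood w.r.t. the score
            b += lr * err
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
    return w, b


def label_segment(x, w, b, threshold=0.5):
    """Return (label, certainty), where certainty is P(prominent | features)."""
    p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
    return ("ACC" if p >= threshold else "NONE", p)


# Hypothetical features per word: [is_content_word, normalized_pitch_peak]
DATA = [
    ([1.0, 0.9], 1), ([1.0, 0.7], 1), ([0.0, 0.8], 1),
    ([0.0, 0.2], 0), ([0.0, 0.1], 0), ([1.0, 0.3], 0),
]

w, b = train(DATA)
high = label_segment([1.0, 0.95], w, b)
low = label_segment([0.0, 0.05], w, b)
print(high, low)
```

Returning the probability alongside the label is one simple way to represent an "indication of certainty"; a real labeler would use far richer lexical, syntactic, and acoustic features.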

8. A system for enriching spoken language translation with prosodic information in a statistical speech translation framework, the system comprising:
- a module configured to receive speech for translation to a target language;
- a module configured to generate pitch accent labels representing segments of the received speech which are prosodically prominent; and
- a module configured to inject pitch accent labels with word tokens within a translation engine to create enriched target language output text.

- 9. The system of claim 8, the system further comprising synthesizing speech in the target language based on the prosody enriched target language output text.
- 10. The system of claim 8, wherein an automatic prosody labeler generates the pitch accent labels.
- 11. The system of claim 10, wherein the automatic prosody labeler exploits lexical, syntactic, and prosodic information of the speech.
- 12. The system of claim 8, wherein a discriminative classifier model (such as a maximum entropy model) is used to determine which segments of the speech are prosodically prominent.
- 13. The system of claim 8, wherein a pitch accent label includes an indication of certainty that a respective segment of the speech is prosodically prominent.
- 14. The system of claim 8, wherein a pitch accent label includes an indication of prosodic prominence of a respective segment of speech.

15. A computer-readable medium storing a computer program having instructions for enriching spoken language translation with prosodic information in a statistical speech translation framework, the instructions comprising:
- receiving speech for translation to a target language;
- generating pitch accent labels representing segments of the received speech which are prosodically prominent; and
- injecting pitch accent labels with word tokens within a translation engine to create enriched target language output text.

- 16. The computer-readable medium of claim 15, the method further comprising synthesizing speech in the target language based on the prosody enriched target language output text.
- 17. The computer-readable medium of claim 15, wherein an automatic prosody labeler generates the pitch accent labels.
- 18. The computer-readable medium of claim 17, wherein the automatic prosody labeler exploits lexical, syntactic, and prosodic information of the speech.
- 19. The computer-readable medium of claim 15, wherein a discriminative classifier model (such as a maximum entropy model) is used to determine which segments of the speech are prosodically prominent.
- 20. The computer-readable medium of claim 15, wherein a pitch accent label includes an indication of certainty that a respective segment of the speech is prosodically prominent.
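
The synthesis step in claims 2, 9, and 16 consumes the prosody-enriched target text. One way this could look, purely as an assumption, is to turn accent-labeled tokens into SSML, marking prominent words with the standard `<emphasis>` element. The `word|LABEL` token format and the `to_ssml` helper are hypothetical; the claims do not say how the enriched text is passed to a synthesizer.

```python
from xml.sax.saxutils import escape


def to_ssml(enriched_tokens):
    """Wrap accent-labeled words in <emphasis>; leave other words as plain text."""
    parts = []
    for token in enriched_tokens:
        word, label = token.rsplit("|", 1)
        text = escape(word)  # protect &, <, > inside the markup
        if label == "ACC":
            parts.append(f'<emphasis level="strong">{text}</emphasis>')
        else:
            parts.append(text)
    return "<speak>" + " ".join(parts) + "</speak>"


ssml = to_ssml(["yo|NONE", "quiero|NONE", "café|ACC"])
print(ssml)
```

Whether a given synthesizer honors `<emphasis>` in a way that resembles a pitch accent is engine-dependent, so this mapping should be read as a placeholder for whatever prosody control the target synthesizer exposes.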

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Vivek Kumar Rangarajan Sridhar; Srinivas Bangalore
Granted Patent: US 8,571,849 B2
US Class Current: 704/3
CPC Class Codes:
- G06F 40/58 Use of machine translation,...
- G10L 13/10 Prosody rules derived from ...
