Spoken translation system using meta information strings

US 8,032,356 B2
Filed: 05/25/2007
Issued: 10/04/2011
Est. Priority Date: 05/25/2006
Status: Active Grant

Abstract

A spoken translation system that detects both the speech in the input information and the meta information streams that accompany it. A first aspect produces an enriched training corpus for use in machine translation. A second aspect applies two different extraction techniques and combines their results by lattice rescoring.

74 Citations

15 Claims

1. A system, comprising:
- a speech receiving part, receiving a segment of speech signal in a source language to be processed;
  
  a computer part, operating to process the segment of speech signal comprising;
  
  processing in a first information channel the segment of speech signal in the source language using a statistical machine translation training, comprising;
  
  recognizing speech in the processed segment of speech signal in the source language, converting the recognized speech into text in the source language, and converting the text in the source language into a lattice in a target language;
  
  processing in a second information channel the segment of speech signal in the source language using an information transfer training, the second information channel independent and separate from the first information channel, the processing in the second information channel comprising;
  
  extracting, from the segment of speech signal, meta information associated with the recognized speech, wherein the meta information includes at least one non-textual aspect of the recognized speech, obtaining descriptors in the source language from the meta information that includes at least one non-textual aspect, and transforming the obtained descriptors in the source language into descriptors in the target language; and
  
  an output part producing an output in the target language comprising combining the lattice in the target language and the obtained descriptors in the second language using lattice rescoring.
- Dependent claims (2, 3, 4, 5, 6, 7, 8):
  - 2. A system as in claim 1, wherein said computer part includes a training database, used to process said segment of speech.
  - 3. A system as in claim 2, wherein said training database comprises a first training part for said text to text statistical machine translation training, and a second training part that includes information about said non-textual aspect.
  - 4. A system as in claim 1, wherein said non-textual aspect includes keywords.
  - 5. A system as in claim 1, wherein said non-textual aspect includes prominence information.
  - 6. A system as in claim 1, wherein said non-textual aspect includes words which indicate emotions in the spoken speech.
  - 7. A system as in claim 1, wherein said output part is an audio producing element.
  - 8. A system as in claim 1, wherein said output part is a part that shows text.
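
The combination step in claim 1 (a target-language lattice from the first channel, rescored with target-language descriptors from the second channel) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the names `Hypothesis`, `Descriptor`, `rescore`, and `best_output` are hypothetical, a short list of scored hypotheses stands in for a real lattice, and the additive bonus is only one possible scoring rule, since the claims do not specify one.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Hypothesis:
    """One candidate translation from the first (text) channel."""
    words: Tuple[str, ...]
    log_prob: float  # score assigned by the translation channel


@dataclass(frozen=True)
class Descriptor:
    """Target-language descriptor from the second (meta information) channel."""
    kind: str      # hypothetical label, e.g. "prominence" or "emotion"
    word: str      # target-language word the descriptor points at
    weight: float  # assumed bonus when a hypothesis contains that word


def rescore(lattice: List[Hypothesis],
            descriptors: List[Descriptor]) -> List[Tuple[float, Hypothesis]]:
    """Add each matching descriptor's weight to the hypothesis score,
    then return (score, hypothesis) pairs sorted best first."""
    rescored = []
    for hyp in lattice:
        bonus = sum(d.weight for d in descriptors if d.word in hyp.words)
        rescored.append((hyp.log_prob + bonus, hyp))
    rescored.sort(key=lambda pair: pair[0], reverse=True)
    return rescored


def best_output(lattice: List[Hypothesis],
                descriptors: List[Descriptor]) -> Tuple[str, float]:
    """Return the top hypothesis as text together with its rescored value."""
    score, hyp = rescore(lattice, descriptors)[0]
    return " ".join(hyp.words), score


if __name__ == "__main__":
    lattice = [
        Hypothesis(("i", "need", "help"), -1.0),
        Hypothesis(("i", "need", "help", "now"), -1.5),
    ]
    descriptors = [Descriptor("prominence", "now", 1.0)]
    print(best_output(lattice, descriptors))
```

In this toy input the text channel alone prefers the shorter hypothesis, while the prominence descriptor on "now" raises the second hypothesis above it. A production system would rescore paths through an actual word lattice rather than a flat list.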

9. A computer-implemented method, comprising:
- processing in a first information channel, at a computer comprising a processor, a segment of speech signal in a source language using a statistical machine translation training, the processing in the first information channel comprising;
  
  recognizing speech in the processed segment of speech signal in the source language, converting the recognized speech into text in the source language, and converting the text in the source language into a lattice in a target language;
  
  processing, at the computer, the segment of speech signal in the source language using an information transfer training in a second information channel independent and separate from the first information channel, the processing in the second information channel comprising;
  
  extracting, from the segment of speech signal, meta information associated with the recognized speech, wherein the meta information includes at least one non-textual aspect of the recognized speech, obtaining descriptors in the source language from the meta information that includes at least one non-textual aspect, and transforming the obtained descriptors in the source language into descriptors in the target language; and
  
  generating an output in the target language comprising combining the lattice in the target language and the descriptors in the target language using a lattice rescoring system.
- Dependent claims (10, 11, 12, 13, 14, 15):
  - 10. A computer-implemented method of claim 9, wherein the meta information is extracted from an input consisting of the segment of speech signal.
  - 11. A computer-implemented method of claim 9, wherein the text in the source language is retained in the first information channel.
  - 12. A computer-implemented method as in claim 9, wherein said non-textual aspect includes keywords.
  - 13. A computer-implemented method as in claim 9, wherein said non-textual aspect includes prominence information.
  - 14. A computer-implemented method as in claim 9, wherein said non-textual aspect includes words which indicate emotions in the spoken speech.
  - 15. A computer-implemented method as in claim 9, wherein said processing is carried out directly on received audio indicative of the speech.

Current Assignee: University of Southern California
Original Assignee: University of Southern California
Inventors: Wang, Dagen; Bulut, Murtaza; Narayanan, Shrikanth; Georgiou, Panayiotis
Primary Examiner(s): ALBERTALLI, BRIAN LOUIS
Application Number: US11/754,148
Publication Number: US 20080065368A1
Time in Patent Office: 1,593 Days
Field of Search: None
US Class Current: 704/2
CPC Class Codes:
G06F 40/284   Lexical analysis, e.g. toke...
G06F 40/44   Statistical methods, e.g. p...
G10L 15/26   Speech to text systems G10L...
