Spoken Translation System Using Meta Information Strings

US 20080065368A1
Filed: 05/25/2007
Published: 03/13/2008
Est. Priority Date: 05/25/2006
Status: Active Grant



Abstract

A spoken translation system that detects, from the input information, both the speech and the meta information streams it carries. A first aspect produces an enriched training corpus for use in machine translation. A second aspect uses two different extraction techniques and combines their results by lattice rescoring.
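As an illustration of an output that carries both recognized text and meta information, the following is a minimal sketch, not the patent's actual representation. The class names `MetaTag` and `EnrichedSegment`, the inline tag syntax, and the label values are all assumptions made for this example; only the categories (keywords, prominence, emotion) come from the claims.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetaTag:
    """Hypothetical descriptor attached to a span of recognized words."""
    kind: str   # e.g. "keyword", "prominence", "emotion" (categories named in the claims)
    value: str  # illustrative label; the patent text here does not fix a label set
    start: int  # index of the first word covered
    end: int    # index one past the last word covered


@dataclass
class EnrichedSegment:
    """Recognized words plus the non-textual descriptors found for them."""
    words: List[str]
    tags: List[MetaTag] = field(default_factory=list)

    def render(self) -> str:
        """Serialize to a single string with inline tags (assumed syntax)."""
        opens: Dict[int, List[str]] = {}
        closes: Dict[int, List[str]] = {}
        for tag in self.tags:
            opens.setdefault(tag.start, []).append(f"<{tag.kind}={tag.value}>")
            closes.setdefault(tag.end, []).append(f"</{tag.kind}>")
        out: List[str] = []
        for i, word in enumerate(self.words):
            out.extend(closes.get(i, []))  # close spans that ended before this word
            out.extend(opens.get(i, []))   # open spans that start at this word
            out.append(word)
        out.extend(closes.get(len(self.words), []))
        return " ".join(out)


segment = EnrichedSegment(
    words=["that", "is", "great"],
    tags=[MetaTag(kind="prominence", value="high", start=2, end=3)],
)
rendered = segment.render()
```

A string of this kind could serve either as a line in an enriched training corpus or as the final output handed to a text display or speech synthesizer; which of these the inventors intended is not stated in the abstract.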


23 Claims

1. A method, comprising:
- processing a segment of speech to be recognized to recognize speech therein, and also to recognize meta information associated with the recognized speech, wherein the meta information includes at least one non-textual aspect of the recognized speech; and
  
  producing an output that represents both the text recognized by said processing, and the at least one non-textual aspect.
- Dependent claims:
  - 2. A method as in claim 1, wherein said processing comprises training a single statistical machine translation result based on both machine translation for text to text, as well as said non-textual aspect.
  - 3. A method as in claim 1, wherein said processing comprises using a first layer of analysis based on a first training corpus from said text to text training, and using a second layer of analysis based on a second training, to determine said non-textual aspect.
  - 4. A method as in claim 3, wherein said producing an output comprises obtaining text from said first layer of analysis, obtaining descriptors from said second layer of analysis, and combining said text and said descriptors.
  - 5. A method as in claim 4, wherein said combining comprises using a lattice rescoring system.
  - 6. A method as in claim 1, wherein said non-textual aspect includes keywords.
  - 7. A method as in claim 1, wherein said non-textual aspect includes prominence information.
  - 8. A method as in claim 1, wherein said non-textual aspect includes words which indicate emotions in the spoken speech.
  - 9. A method as in claim 1, wherein said processing is carried out directly on received audio indicative of the speech.
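Claims 3 to 5 describe a two-layer arrangement in which text comes from one layer, descriptors from another, and the two are combined by lattice rescoring. The sketch below shows one generic way such a rescoring step can work; it is an assumption-laden illustration, not the method disclosed in the specification. The function `rescore_lattice`, the edge format, the `<emph>` token suffix, and all scores are hypothetical. Nodes are assumed to be integers numbered in topological order, with node 0 as the start.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical edge format: (from_node, to_node, token, first_layer_log_score).
Edge = Tuple[int, int, str, float]


def rescore_lattice(
    edges: List[Edge],
    second_layer: Callable[[str], float],
    weight: float,
    final_node: int,
) -> Tuple[float, List[str]]:
    """Find the best path after adding weight * second_layer(token) to each edge.

    Assumes every edge satisfies from_node < to_node, so visiting edges in
    order of from_node processes the lattice in topological order.
    """
    best: Dict[int, Tuple[float, List[str]]] = {0: (0.0, [])}
    for src, dst, token, score in sorted(edges, key=lambda e: e[0]):
        if src not in best:
            continue  # source node is unreachable from the start node
        total = best[src][0] + score + weight * second_layer(token)
        if dst not in best or total > best[dst][0]:
            best[dst] = (total, best[src][1] + [token])
    if final_node not in best:
        raise ValueError("final node is not reachable from node 0")
    return best[final_node]


# First-layer lattice: the plain reading of the last word scores higher.
lattice: List[Edge] = [
    (0, 1, "that", -0.1),
    (1, 2, "is", -0.2),
    (2, 3, "great", -0.5),
    (2, 3, "great<emph>", -0.9),
]

# Second-layer scores (made up): the descriptor model favors the emphasized reading.
descriptor_scores = {"great": -1.0, "great<emph>": -0.1}


def second_layer(token: str) -> float:
    return descriptor_scores.get(token, 0.0)


first_layer_only = rescore_lattice(lattice, second_layer, weight=0.0, final_node=3)
rescored = rescore_lattice(lattice, second_layer, weight=1.0, final_node=3)
```

With the weight set to zero the first layer alone decides and the plain token wins; with the second-layer scores added, the path carrying the descriptor becomes the best one. The additive log-score combination and the single `weight` parameter are design choices of this sketch.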

10. A system, comprising:
- a speech receiving part, receiving speech to be processed; and
  
  a computer part, operating to process a segment of speech to be recognized and to recognize speech therein, and also to recognize meta information associated with the recognized speech, wherein the meta information includes at least one non-textual aspect of the recognized speech, and producing an output indicative of the recognized speech and the meta information; and
  
  an output part, receiving said output from said computer part, and producing an output represents both the text recognized by said processing, and the at least one non-textual aspect.
- Dependent claims:
  - 11. A system as in claim 10, wherein said computer part includes a training database, used to process said segment of speech.
  - 12. A system as in claim 11, wherein said training database comprises a single statistical machine translation result that allows machine translation for text to text, as well as said non-textual aspect.
  - 13. A system as in claim 11, wherein said training database comprises a first training part for said text to text training, and a second training that includes information about said non-textual aspect.
  - 14. A system as in claim 13, wherein said computer part produces said output by determining text from said first training part, determining descriptors from said second training part, and combining said text and said descriptors.
  - 15. A system as in claim 14, wherein said computer part uses a lattice rescoring system for combining said text and said descriptors.
  - 16. A system as in claim 10, wherein said non-textual aspect includes keywords.
  - 17. A system as in claim 10, wherein said non-textual aspect includes prominence information.
  - 18. A system as in claim 10, wherein said non-textual aspect includes words which indicate emotions in the spoken speech.
  - 19. A system as in claim 10, wherein said output part is an audio producing element.
  - 20. A system as in claim 10, wherein said output part is a part that shows text.

21. A method, comprising:
- processing an audio version indicative of a segment of speech to be recognized, to recognize speech therein, and also to recognize additional information associated with the recognized speech, wherein the additional information includes at least one of keywords, prominence information, and/or emotional information; and
  
  producing an output that represents both the text recognized by said processing, and the additional information.
- Dependent claims:
  - 22. A method as in claim 21, wherein said processing comprises using a single statistical machine translation training database for both said recognize and said additional information.
  - 23. A method as in claim 21, wherein said processing comprises using a first training database for said text to text training, and a second database for said additional information.


Current Assignee
University of Southern California
Original Assignee
University of Southern California
Inventors
Narayanan, Shrikanth; Georgiou, Panayiotis; Wang, Dagen; Bulut, Murtaza

Granted Patent

US 8,032,356 B2
US Class Current

704/3
CPC Class Codes

G06F 40/284   Lexical analysis, e.g. toke...

G06F 40/44   Statistical methods, e.g. p...

G10L 15/26   Speech to text systems G10L...
