Context-sensitive communication and translation methods for enhanced interactions and understanding among speakers of different languages
Abstract
Architecture that interacts with a user, or users of different tongues, to enhance speech translation. A recognized concept or situation is sensed and/or converged upon, and disambiguated through mixed-initiative user interaction with a device to provide simplified inferences about user communication goals when working with others who speak another language. Reasoning is applied about communication goals based on the concept or situation at the current focus of attention, or on the probability distribution over the likely focus of attention, and the user or the user's conversational partner is provided with appropriately triaged choices of images, text, and/or speech translations for review or perception. The inferences can also process an utterance or other input from a user as part of the evidence in reasoning about a concept, situation, or goals, and/or in disambiguating the latter. The system's best understanding of the question, need, or intention at the crux of the communication can be echoed back to the user for confirmation. Context-sensitive focusing of recognition and information-gathering components can be provided based on the listening, and can employ words recognized from prior or current user utterances to further focus the inference.
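The abstract's mixed-initiative disambiguation can be read as maintaining a probability distribution over candidate communication goals, updating it with contextual evidence, and echoing the best hypothesis back to the user for confirmation. The sketch below is purely illustrative; the goal names, likelihood values, and function names are assumptions, not taken from the patent.

```python
# Hypothetical sketch of the disambiguation loop described in the abstract:
# hold a distribution over candidate goals, update it with evidence from
# the current utterance/context, then echo the top hypothesis back.

def normalize(dist):
    """Rescale a distribution so its probabilities sum to 1."""
    total = sum(dist.values())
    return {goal: p / total for goal, p in dist.items()}

def update_with_evidence(prior, likelihoods):
    """Bayesian update: P(goal | evidence) is proportional to
    P(evidence | goal) * P(goal)."""
    posterior = {g: prior[g] * likelihoods.get(g, 1e-6) for g in prior}
    return normalize(posterior)

def best_hypothesis(dist):
    """The system's current best understanding of the user's goal."""
    return max(dist, key=dist.get)

# Uniform prior over candidate goals inferred from context.
prior = normalize({"ask_directions": 1.0, "order_food": 1.0, "book_room": 1.0})

# Evidence: the word "menu" was recognized in the user's utterance
# (likelihood values here are invented for illustration).
posterior = update_with_evidence(
    prior, {"order_food": 0.8, "ask_directions": 0.05, "book_room": 0.1}
)

# Echo the best understanding back to the user for confirmation
# before committing to a translation.
print(f"Did you mean to: {best_hypothesis(posterior)}?")
```

New evidence (a follow-up utterance, a gesture, a location reading) would simply be folded in by calling `update_with_evidence` again on the posterior.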
15 Claims
1. A system that facilitates speech translation, comprising:

a processor;

a speech recognition component operated by the processor that processes sensed data of a current context and facilitates a speech recognition process based on the sensed data;

a historical activities component operated by the processor that stores historical data associated with the speech recognition process;

a language model operated by the processor that is created to recognize speech of a user and which is updated based on interactions with the user and on responses of a foreign language speaker, wherein the foreign language speaker is a person being addressed and provides the responses via a backchannel between the system and a device of the foreign language speaker; and

a language opportunity component operated by the processor that improves the speech recognition process by pushing a training session of one or more terms to the user, which increases the likelihood of success when using the one or more terms during the speech recognition process.

Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
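One way to picture the components recited in claim 1 is as cooperating objects: a store of historical recognition data, a per-user language model updated from both user interactions and backchannel responses, and a "language opportunity" component that pushes training for poorly recognized terms. This is an illustrative sketch only; the class names, weights, and threshold are assumptions, not drawn from the patent's specification.

```python
# Illustrative sketch of claim 1's components. All names and values
# (term_weights, the 0.5 threshold, the 0.1 step) are invented.

class HistoricalActivities:
    """Stores historical data associated with the speech recognition process."""
    def __init__(self):
        self.records = []

    def store(self, utterance, result):
        self.records.append((utterance, result))

class LanguageModel:
    """Per-user model, updated from interactions with the user and from
    backchannel responses returned by the foreign language speaker's device."""
    def __init__(self):
        self.term_weights = {}

    def update(self, terms, success):
        # Raise a term's weight on a successful exchange, lower it otherwise.
        for term in terms:
            delta = 0.1 if success else -0.1
            self.term_weights[term] = self.term_weights.get(term, 0.5) + delta

class LanguageOpportunity:
    """Selects terms for a pushed training session: those the user is
    least likely to use successfully during recognition."""
    def training_terms(self, model, threshold=0.5):
        return [t for t, w in model.term_weights.items() if w < threshold]

model = LanguageModel()
model.update(["restaurant"], success=True)     # weight 0.5 -> 0.6
model.update(["reservation"], success=False)   # weight 0.5 -> 0.4
print(LanguageOpportunity().training_terms(model))  # ['reservation']
```

The backchannel in the claim would feed the `success` signal: a confused response from the foreign language speaker's device lowers the term's weight, which eventually triggers a training push.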
10. A method of facilitating the translation of speech between users of different tongues, comprising:

receiving, by a computing device, speech signals of a user and other inputs during a speech recognition process;

computing, by the computing device, an inference of a user context based on analysis of the speech signals and the other inputs, wherein the user context includes at least one of an image, location information, gesture information, and search information;

modifying, by the computing device, the speech recognition process according to the inference;

interacting, by the computing device, with the user to resolve ambiguous speech;

presenting, by the computing device, translated speech to a foreign language speaker by sending the translated speech to a device of the foreign language speaker;

modifying, by the computing device, the speech recognition process based on the interacting with the user and on a response from the foreign language speaker, wherein the foreign language speaker provides the response via a backchannel between the computing device and the device of the foreign language speaker; and

updating, by the computing device, at least one of a user phonemic model and a user language model based on the act of interacting.

Dependent claims: 11, 12, 13
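The method steps above form a pipeline: infer context from speech and other inputs, resolve ambiguity, then present a translation to the foreign speaker's device. A minimal sketch follows, with trivial stand-ins for each step; the translation table, context-inference heuristic, and all function names are placeholders, not the patented implementations.

```python
# Minimal stand-ins for claim 10's pipeline steps. Everything here
# (the keyword heuristic, the lookup-table "translator") is invented
# for illustration.

def infer_context(speech, other_inputs):
    # Compute an inference of user context from the speech signals and
    # other inputs (image, location, gesture, or search information).
    topic = "dining" if "eat" in speech else "general"
    return {"location": other_inputs.get("location"), "topic": topic}

def resolve_ambiguity(speech, context):
    # Interact with the user to resolve ambiguous speech; here this is
    # reduced to tagging the utterance with the inferred topic.
    return f"{speech} [{context['topic']}]"

def translate(speech, table):
    # Present translated speech by sending it to the foreign language
    # speaker's device; unknown utterances pass through unchanged.
    return table.get(speech, speech)

TABLE = {"where can I eat": "¿dónde puedo comer?"}

context = infer_context("where can I eat", {"location": "Madrid"})
resolved = resolve_ambiguity("where can I eat", context)
sent = translate("where can I eat", TABLE)
print(sent)  # ¿dónde puedo comer?
```

The two "modifying" steps of the claim would close the loop: the inferred context and the backchannel response would both adjust the recognizer (e.g., reweighting the user's phonemic and language models) before the next utterance.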
14. A system that facilitates communication between users of different tongues, comprising:

a processor;

means operable by the processor for receiving speech signals of at least one user and other input during a speech recognition process;

means operable by the processor for computing an inference of a user context based on analysis of the speech signals and the other input, wherein the user context includes at least one of an image, location information, gesture information, and search information;

means operable by the processor for interacting with the at least one user to resolve ambiguous speech;

means operable by the processor for modifying the speech recognition process according to the inference;

means operable by the processor for presenting translated speech to a foreign language speaker by sending the translated speech to a device of the foreign language speaker;

means operable by the processor for modifying the speech recognition process based on the interacting with the user and on a response from the foreign language speaker, wherein the foreign language speaker provides the response via a backchannel between the system and the device of the foreign language speaker; and

means operable by the processor for updating at least one of a user phonemic model and a user language model based on the act of interacting.

Dependent claims: 15