Context-sensitive communication and translation methods for enhanced interactions and understanding among speakers of different languages
Abstract
Architecture that interacts with a user, or users of different tongues, to enhance speech translation. A recognized concept or situation is sensed and/or converged upon, and disambiguated through mixed-initiative user interaction with a device to provide simplified inferences about user communication goals when working with others who speak another language. Reasoning is applied about communication goals based on the concept or situation at the current focus of attention, or on the probability distribution over the likely focus of attention, and the user or the user's conversational partner is provided with appropriately triaged choices of images, text, and/or speech translations for review or perception. The inferences can also process an utterance or other input from a user as part of the evidence in reasoning about a concept, situation, or goals, and/or in disambiguating the latter. The system's best understanding of the question, need, or intention at the crux of the communication can be echoed back to the user for confirmation. Context-sensitive focusing of recognition and information-gathering components can be provided based on the listening, and can employ words recognized from prior or current user utterances to further focus the inference.
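The abstract's mixed-initiative disambiguation can be read as maintaining a probability distribution over candidate communication goals, updating it with contextual evidence, and echoing the best hypothesis back to the user for confirmation. The sketch below is purely illustrative; the goal names, likelihood values, and function names are assumptions, not taken from the patent.

```python
# Hypothetical sketch of the disambiguation loop described in the abstract:
# hold a distribution over candidate goals, update it with evidence from
# the current utterance/context, then echo the top hypothesis back.

def normalize(dist):
    """Rescale a distribution so its probabilities sum to 1."""
    total = sum(dist.values())
    return {goal: p / total for goal, p in dist.items()}

def update_with_evidence(prior, likelihoods):
    """Bayesian update: P(goal | evidence) is proportional to
    P(evidence | goal) * P(goal)."""
    posterior = {g: prior[g] * likelihoods.get(g, 1e-6) for g in prior}
    return normalize(posterior)

def best_hypothesis(dist):
    """The system's current best understanding of the user's goal."""
    return max(dist, key=dist.get)

# Uniform prior over candidate goals inferred from context.
prior = normalize({"ask_directions": 1.0, "order_food": 1.0, "book_room": 1.0})

# Evidence: the word "menu" was recognized in the user's utterance
# (likelihood values here are invented for illustration).
posterior = update_with_evidence(
    prior, {"order_food": 0.8, "ask_directions": 0.05, "book_room": 0.1}
)

# Echo the best understanding back to the user for confirmation
# before committing to a translation.
print(f"Did you mean to: {best_hypothesis(posterior)}?")
```

New evidence (a follow-up utterance, a gesture, a location reading) would simply be folded in by calling `update_with_evidence` again on the posterior.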
15 Claims
1. A system that facilitates speech translation, comprising:

a processor;

a speech recognition component operated by the processor that processes sensed data of a current context and facilitates a speech recognition process based on the sensed data;

a historical activities component operated by the processor that stores historical data associated with the speech recognition process;

a language model operated by the processor that is created to recognize speech of a user and which is updated based on interactions with the user and on responses of a foreign language speaker, wherein the foreign language speaker is a person being addressed and provides the responses via a backchannel between the system and a device of the foreign language speaker; and

a language opportunity component operated by the processor that improves the speech recognition process by pushing a training session of one or more terms to the user, which increases the likelihood of success when using the one or more terms during the speech recognition process.

Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
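One way to picture the components recited in claim 1 is as cooperating objects: a store of historical recognition data, a per-user language model updated from both user interactions and backchannel responses, and a "language opportunity" component that pushes training for poorly recognized terms. This is an illustrative sketch only; the class names, weights, and threshold are assumptions, not drawn from the patent's specification.

```python
# Illustrative sketch of claim 1's components. All names and values
# (term_weights, the 0.5 threshold, the 0.1 step) are invented.

class HistoricalActivities:
    """Stores historical data associated with the speech recognition process."""
    def __init__(self):
        self.records = []

    def store(self, utterance, result):
        self.records.append((utterance, result))

class LanguageModel:
    """Per-user model, updated from interactions with the user and from
    backchannel responses returned by the foreign language speaker's device."""
    def __init__(self):
        self.term_weights = {}

    def update(self, terms, success):
        # Raise a term's weight on a successful exchange, lower it otherwise.
        for term in terms:
            delta = 0.1 if success else -0.1
            self.term_weights[term] = self.term_weights.get(term, 0.5) + delta

class LanguageOpportunity:
    """Selects terms for a pushed training session: those the user is
    least likely to use successfully during recognition."""
    def training_terms(self, model, threshold=0.5):
        return [t for t, w in model.term_weights.items() if w < threshold]

model = LanguageModel()
model.update(["restaurant"], success=True)     # weight 0.5 -> 0.6
model.update(["reservation"], success=False)   # weight 0.5 -> 0.4
print(LanguageOpportunity().training_terms(model))  # ['reservation']
```

The backchannel in the claim would feed the `success` signal: a confused response from the foreign language speaker's device lowers the term's weight, which eventually triggers a training push.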
10. A method of facilitating the translation of speech between users of different tongues, comprising:

receiving, by a computing device, speech signals of a user and other inputs during a speech recognition process;

computing, by the computing device, an inference of a user context based on analysis of the speech signals and the other inputs, wherein the user context includes at least one of an image, location information, gesture information, and search information;

modifying, by the computing device, the speech recognition process according to the inference;

interacting, by the computing device, with the user to resolve ambiguous speech;

presenting, by the computing device, translated speech to a foreign language speaker by sending the translated speech to a device of the foreign language speaker;

modifying, by the computing device, the speech recognition process based on the interacting with the user and on a response from the foreign language speaker, wherein the foreign language speaker provides the response via a backchannel between the computing device and the device of the foreign language speaker; and

updating, by the computing device, at least one of a user phonemic model and a user language model based on the act of interacting.

Dependent claims: 11, 12, 13
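The method steps above form a pipeline: infer context from speech and other inputs, resolve ambiguity, then present a translation to the foreign speaker's device. A minimal sketch follows, with trivial stand-ins for each step; the translation table, context-inference heuristic, and all function names are placeholders, not the patented implementations.

```python
# Minimal stand-ins for claim 10's pipeline steps. Everything here
# (the keyword heuristic, the lookup-table "translator") is invented
# for illustration.

def infer_context(speech, other_inputs):
    # Compute an inference of user context from the speech signals and
    # other inputs (image, location, gesture, or search information).
    topic = "dining" if "eat" in speech else "general"
    return {"location": other_inputs.get("location"), "topic": topic}

def resolve_ambiguity(speech, context):
    # Interact with the user to resolve ambiguous speech; here this is
    # reduced to tagging the utterance with the inferred topic.
    return f"{speech} [{context['topic']}]"

def translate(speech, table):
    # Present translated speech by sending it to the foreign language
    # speaker's device; unknown utterances pass through unchanged.
    return table.get(speech, speech)

TABLE = {"where can I eat": "¿dónde puedo comer?"}

context = infer_context("where can I eat", {"location": "Madrid"})
resolved = resolve_ambiguity("where can I eat", context)
sent = translate("where can I eat", TABLE)
print(sent)  # ¿dónde puedo comer?
```

The two "modifying" steps of the claim would close the loop: the inferred context and the backchannel response would both adjust the recognizer (e.g., reweighting the user's phonemic and language models) before the next utterance.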
14. A system that facilitates communication between users of different tongues, comprising:

a processor;

means operable by the processor for receiving speech signals of at least one user and other input during a speech recognition process;

means operable by the processor for computing an inference of a user context based on analysis of the speech signals and the other input, wherein the user context includes at least one of an image, location information, gesture information, and search information;

means operable by the processor for interacting with the at least one user to resolve ambiguous speech;

means operable by the processor for modifying the speech recognition process according to the inference;

means operable by the processor for presenting translated speech to a foreign language speaker by sending the translated speech to a device of the foreign language speaker;

means operable by the processor for modifying the speech recognition process based on the interacting with the user and on a response from the foreign language speaker, wherein the foreign language speaker provides the response via a backchannel between the system and the device of the foreign language speaker; and

means operable by the processor for updating at least one of a user phonemic model and a user language model based on the act of interacting.

Dependent claims: 15