MOBILE SYSTEMS AND METHODS OF SUPPORTING NATURAL LANGUAGE HUMAN-MACHINE INTERACTIONS
Abstract
A mobile system is provided that includes speech-based and non-speech-based interfaces for telematics applications. The mobile system identifies and uses context, prior information, domain knowledge, and user-specific profile data to provide a natural environment for users who submit requests and/or commands in multiple domains. The invention creates, stores, and uses extensive personal profile information for each user, improving the reliability of determining the context and presenting the expected results for a particular question or command. The invention may organize domain-specific behavior and information into agents that are distributable or updatable over a wide area network.
48 Claims
1-32. (canceled)
33. A device for processing natural language inputs, comprising one or more processors configured to:
receive a multi-modal natural language input from a user, the multi-modal natural language input including a natural language utterance and a non-speech input;
generate a non-speech transcription from the non-speech input;
identify the user who provided the multi-modal natural language input;
generate a speech-based transcription based on a cognitive model associated with the user, wherein the cognitive model includes information on one or more prior interactions between the user and the device;
generate a merged transcription from the speech-based transcription and the non-speech transcription;
identify an entry in a context stack that matches information in the merged transcription;
identify a domain agent associated with the entry in the context stack;
determine a request based on the merged transcription; and
communicate the request to the domain agent, wherein the domain agent is configured to generate a response to the user.

Dependent claims: 34-43.
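The device claim above describes a pipeline: merge the speech and non-speech transcriptions, match the merged text against a context stack, and dispatch the request to the matching domain agent. The claim prescribes no implementation; the sketch below is a minimal, hypothetical illustration of such a pipeline, and every name in it (ContextStack, DomainAgent, merge_transcriptions) is invented for illustration, not taken from the patent.

```python
# Illustrative sketch only: all class and function names are hypothetical;
# the patent claim does not prescribe any particular implementation.
from dataclasses import dataclass, field


@dataclass
class DomainAgent:
    """Handles requests for one domain (e.g. navigation, weather)."""
    domain: str

    def respond(self, request: str) -> str:
        return f"[{self.domain}] handling: {request}"


@dataclass
class ContextStack:
    """Most-recent-first stack of (keywords, agent) context entries."""
    entries: list = field(default_factory=list)

    def push(self, keywords: set, agent: DomainAgent) -> None:
        self.entries.insert(0, (keywords, agent))

    def match(self, transcription: str):
        """Return the agent of the first entry sharing a keyword with the text."""
        words = set(transcription.lower().split())
        for keywords, agent in self.entries:
            if keywords & words:
                return agent
        return None


def merge_transcriptions(speech: str, non_speech: str) -> str:
    # Trivial merge for illustration: a real system would align and
    # disambiguate the speech-based and non-speech transcriptions.
    return f"{speech} {non_speech}".strip()


# Usage: two context entries, then a mixed speech/non-speech request.
stack = ContextStack()
stack.push({"route", "traffic"}, DomainAgent("navigation"))
stack.push({"forecast", "rain"}, DomainAgent("weather"))

merged = merge_transcriptions("what is the traffic like", "on route 9")
agent = stack.match(merged)
print(agent.respond(merged))  # matched to the navigation agent
```

The stack ordering matters: the most recently pushed context is checked first, so the current conversational context wins when multiple domains could match.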
44. A method for processing natural language inputs, comprising:
receiving, by one or more processors, a multi-modal natural language input from a user, the multi-modal natural language input including a natural language utterance and a non-speech input;
generating, by the one or more processors, a non-speech transcription from the non-speech input;
identifying, by the one or more processors, the user who provided the multi-modal natural language input;
generating, by the one or more processors, a speech-based transcription based on a cognitive model associated with the user, wherein the cognitive model includes information on one or more prior interactions between the user and the device;
generating, by the one or more processors, a merged transcription from the speech-based transcription and the non-speech transcription;
identifying, by the one or more processors, an entry in a context stack that matches information in the merged transcription;
identifying, by the one or more processors, a domain agent associated with the entry in the context stack;
determining, by the one or more processors, a request based on the merged transcription; and
communicating, by the one or more processors, the request to the domain agent, wherein the domain agent is configured to generate a response to the user.

Dependent claims: 45.
46. A device for processing natural language inputs, comprising one or more processors configured to:
receive a natural language utterance from a user;
identify the user who provided the natural language utterance;
generate a speech-based transcription based on a personal cognitive model associated with the user and a general cognitive model associated with the user, wherein the personal cognitive model includes information on one or more prior interactions between the device and the user, and wherein the general cognitive model includes information on one or more prior interactions between the device and a plurality of users;
identify an entry indicative of a context of the natural language utterance;
identify a domain agent associated with the entry in the stack;
determine a request based on the speech-based transcription;
and communicate the request to the domain agent, wherein the domain agent is configured to generate a response to the user.

Dependent claims: 47, 48.
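Claim 46 combines a personal cognitive model (this user's prior interactions) with a general cognitive model (many users' interactions) when generating the transcription. The patent does not define how the two models are combined; linear interpolation of the two models' word statistics is one common rescoring technique, sketched below under that assumption. All names (build_model, score, the lam weight) are hypothetical.

```python
# Hypothetical sketch: the patent does not specify how the personal and
# general cognitive models are combined; this uses linear interpolation
# of unigram frequencies to rescore transcription hypotheses.
from collections import Counter


def build_model(utterances):
    """Unigram relative frequencies over a set of prior interactions."""
    counts = Counter(w for u in utterances for w in u.lower().split())
    total = sum(counts.values()) or 1
    return {w: c / total for w, c in counts.items()}


def score(hypothesis, personal, general, lam=0.7, floor=1e-6):
    """Score one transcription hypothesis; lam weights the personal model."""
    s = 0.0
    for w in hypothesis.lower().split():
        p = lam * personal.get(w, 0.0) + (1 - lam) * general.get(w, 0.0)
        s += max(p, floor)  # floor keeps unseen words from zeroing a hypothesis
    return s


# Usage: this user's history disambiguates acoustically similar hypotheses.
personal = build_model(["call mom", "call dad", "navigate home"])
general = build_model(["play music", "call taxi", "weather today"])

hyps = ["call mom", "call bomb"]
best = max(hyps, key=lambda h: score(h, personal, general))
print(best)  # the personal model favors "call mom"
```

Weighting the personal model above the general one (lam &gt; 0.5) reflects the claim's emphasis on user-specific prior interactions, while the general model supplies coverage for words the user has never spoken.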
Specification