Mobile systems and methods of supporting natural language human-machine interactions
First Claim
1. A mobile device for annotating objects using multi-modal natural language inputs, comprising:
an interface configured to communicate with a location service to determine location information associated with an object accessible to the mobile device;
a message service configured to communicate the location information to a storage device configured to store the location information with the object;
one or more input devices configured to receive a multi-modal natural language input, wherein the multi-modal natural language input includes a natural language utterance that annotates the object;
a speech recognition engine configured to transcribe the natural language utterance into a textual annotation using a dynamic grammar;
an agent architecture configured to search the storage device with one or more semantic attributes extracted from the natural language utterance and retrieve the object from the storage device in response to the extracted semantic attributes matching metadata associated with the location information stored with the object in the storage device; and
a processing unit configured to label the object retrieved from the storage device with the textual annotation to post-process the object with the textual annotation, wherein the storage device is further configured to store the textual annotation with the annotated object to post-process the annotated object.
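The retrieve-and-label flow recited in this claim can be illustrated with a minimal sketch. It assumes the utterance has already been transcribed and that location information has already been reduced to a set of keywords. Every name here (`StoredObject`, `StorageDevice`, `extract_semantic_attributes`, `annotate_by_utterance`, the stopword list) is hypothetical and chosen only for illustration. The claim does not specify how semantic attributes are extracted or matched, so simple keyword overlap stands in for that step.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical stopword list; the claim does not say how attributes are extracted.
STOPWORDS = {"a", "an", "the", "at", "in", "on", "of", "with"}


@dataclass
class StoredObject:
    """An object stored together with metadata derived from its location."""
    object_id: str
    location_metadata: set[str]
    annotations: list[str] = field(default_factory=list)


class StorageDevice:
    """In-memory stand-in for the claimed storage device."""

    def __init__(self) -> None:
        self._objects: dict[str, StoredObject] = {}

    def store(self, obj: StoredObject) -> None:
        self._objects[obj.object_id] = obj

    def search(self, attributes: set[str]) -> list[StoredObject]:
        # An object matches when any attribute overlaps its location metadata.
        return [o for o in self._objects.values() if attributes & o.location_metadata]


def extract_semantic_attributes(transcript: str) -> set[str]:
    """Assumed extraction step: lowercase keywords minus stopwords."""
    words = {w.strip(".,!?").lower() for w in transcript.split()}
    return words - STOPWORDS - {""}


def annotate_by_utterance(storage: StorageDevice, transcript: str) -> list[StoredObject]:
    """Retrieve objects whose location metadata matches, then label them."""
    matches = storage.search(extract_semantic_attributes(transcript))
    for obj in matches:
        obj.annotations.append(transcript)  # annotation is stored with the object
    return matches
```

In this sketch, calling `annotate_by_utterance(storage, "Sunset at the Seattle pier")` would label only those stored objects whose location metadata contains a keyword from the utterance, such as `seattle` or `pier`.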
Abstract
A mobile system is provided that includes speech-based and non-speech-based interfaces for telematics applications. The mobile system identifies and uses context, prior information, domain knowledge, and user-specific profile data to achieve a natural environment for users who submit requests and/or commands in multiple domains. The invention creates, stores, and uses extensive personal profile information for each user, thereby improving the reliability of determining the context and presenting the expected results for a particular question or command. The invention may organize domain-specific behavior and information into agents that are distributable or updateable over a wide area network.
31 Claims
16. A non-transitory computer-readable storage medium that stores computer-executable instructions for annotating objects using multi-modal natural language inputs, wherein executing the computer-executable instructions on one or more processors causes the one or more processors to:
receive a multi-modal natural language input at one or more input devices coupled to the one or more processors, wherein the multi-modal natural language input includes a natural language utterance that annotates an object accessible to the one or more processors;
transcribe the natural language utterance into a textual annotation with a speech recognition engine coupled to the one or more processors, wherein the speech recognition engine uses a dynamic grammar to transcribe the natural language utterance into the textual annotation;
communicate the textual annotation to a storage device with a message service coupled to the one or more processors, wherein the storage device stores the textual annotation with the annotated object;
communicatively couple, in an agent architecture associated with the one or more processors, services associated with an agent manager, a system agent, a plurality of domain agents, and an agent library that includes one or more utilities that the system agent and the plurality of domain agents can use;
use, by the agent architecture, the communicatively coupled services to search the storage device with one or more semantic attributes extracted from a subsequent natural language utterance; and
use, by the agent architecture, the communicatively coupled services to retrieve the annotated object from the storage device in response to the extracted semantic attributes matching metadata stored with the textual annotation in the storage device.
17. A mobile device for annotating objects using multi-modal natural language inputs, comprising:
one or more input devices configured to receive a multi-modal natural language input, wherein the multi-modal natural language input includes a natural language utterance that annotates an object accessible to the mobile device;
a speech recognition engine configured to transcribe the natural language utterance into a textual annotation using a dynamic grammar;
a message service configured to communicate the textual annotation to a storage device configured to store the textual annotation with the annotated object; and
an agent architecture configured to:
communicatively couple services associated with an agent manager, a system agent, a plurality of domain agents, and an agent library that includes one or more utilities that the system agent and the plurality of domain agents can use;
use the communicatively coupled services to search the storage device with one or more semantic attributes extracted from a subsequent natural language utterance; and
use the communicatively coupled services to retrieve the annotated object from the storage device in response to the extracted semantic attributes matching metadata stored with the textual annotation in the storage device.
18. A method for annotating objects using multi-modal natural language inputs, comprising:
receiving a multi-modal natural language input at one or more input devices coupled to a mobile device, wherein the multi-modal natural language input includes a natural language utterance that annotates an object accessible to the mobile device;
transcribing the natural language utterance into a textual annotation with a speech recognition engine coupled to the mobile device, wherein the speech recognition engine uses a dynamic grammar to transcribe the natural language utterance into the textual annotation;
communicating the textual annotation to a storage device with a message service coupled to the mobile device, wherein the storage device stores the textual annotation with the annotated object;
communicatively coupling, in an agent architecture associated with the mobile device, services associated with an agent manager, a system agent, a plurality of domain agents, and an agent library that includes one or more utilities that the system agent and the plurality of domain agents can use;
using, by the agent architecture, the communicatively coupled services to search the storage device with one or more semantic attributes extracted from a subsequent natural language utterance; and
using, by the agent architecture, the communicatively coupled services to retrieve the annotated object from the storage device in response to the extracted semantic attributes matching metadata stored with the textual annotation in the storage device.
19. A method for annotating objects using multi-modal natural language inputs, comprising:
communicating with a location service to determine location information associated with an object accessible to a mobile device;
communicating the location information to a storage device configured to store the location information with the object, wherein a message service coupled to the mobile device communicates the location information to the storage device;
receiving a multi-modal natural language input at one or more input devices coupled to a mobile device, wherein the multi-modal natural language input includes a natural language utterance that annotates the object;
transcribing the natural language utterance into a textual annotation with a speech recognition engine coupled to the mobile device, wherein the speech recognition engine uses a dynamic grammar to transcribe the natural language utterance into the textual annotation;
searching, by an agent architecture coupled to the mobile device, the storage device with one or more semantic attributes extracted from the natural language utterance, wherein the agent architecture retrieves the object from the storage device in response to the extracted semantic attributes matching metadata associated with the location information stored with the object in the storage device; and
labeling the object retrieved from the storage device with the textual annotation to post-process the object with the textual annotation, wherein the storage device further stores the textual annotation with the annotated object to post-process the annotated object.
29. A system for annotating objects using multi-modal natural language inputs, comprising:
a storage device configured to store an object accessible to an electronic device;
one or more input devices configured to receive a multi-modal natural language input, wherein the multi-modal natural language input includes a natural language utterance that annotates the object;
a speech recognition engine configured to transcribe the natural language utterance into a textual annotation using a dynamic grammar;
a message service configured to communicate the textual annotation to the storage device, wherein the storage device is further configured to store the textual annotation with the annotated object; and
an agent architecture configured to:
communicatively couple services associated with an agent manager, a system agent, a plurality of domain agents, and an agent library that includes one or more utilities that the system agent and the plurality of domain agents can use;
use the communicatively coupled services to search the storage device with one or more semantic attributes extracted from a subsequent natural language utterance; and
use the communicatively coupled services to retrieve the annotated object from the storage device in response to the extracted semantic attributes matching metadata stored with the textual annotation in the storage device.
30. A system for annotating objects using multi-modal natural language inputs, comprising:
a storage device configured to store an object accessible to an electronic device;
an interface configured to communicate with a location service to determine location information associated with the object;
a message service configured to communicate the location information to the storage device, wherein the storage device is further configured to store the location information with the object;
one or more input devices configured to receive a multi-modal natural language input, wherein the multi-modal natural language input includes a natural language utterance that annotates the object;
a speech recognition engine configured to transcribe the natural language utterance into a textual annotation using a dynamic grammar;
an agent architecture configured to search the storage device with one or more semantic attributes extracted from the natural language utterance and retrieve the object from the storage device in response to the extracted semantic attributes matching metadata associated with the location information stored with the object in the storage device; and
a processing unit configured to label the object retrieved from the storage device with the textual annotation to post-process the object with the textual annotation, wherein the storage device is further configured to store the textual annotation with the annotated object to post-process the annotated object.
31. A non-transitory computer-readable storage medium that stores computer-executable instructions for annotating objects using multi-modal natural language inputs, wherein executing the computer-executable instructions on one or more processors causes the one or more processors to:
communicate with a location service to determine location information associated with an object accessible to the one or more processors;
communicate the location information to a storage device configured to store the location information with the object, wherein a message service coupled to the one or more processors communicates the location information to the storage device;
receive a multi-modal natural language input at one or more input devices coupled to the one or more processors, wherein the multi-modal natural language input includes a natural language utterance that annotates the object;
transcribe the natural language utterance into a textual annotation with a speech recognition engine coupled to the one or more processors, wherein the speech recognition engine uses a dynamic grammar to transcribe the natural language utterance into the textual annotation;
search, by an agent architecture coupled to the one or more processors, the storage device with one or more semantic attributes extracted from the natural language utterance, wherein the agent architecture retrieves the object from the storage device in response to the extracted semantic attributes matching metadata associated with the location information stored with the object in the storage device; and
label the object retrieved from the storage device with a processing unit coupled to the one or more processors, wherein the processing unit labels the object with the textual annotation to post-process the object with the textual annotation, and wherein the storage device is further configured to store the textual annotation with the annotated object to post-process the annotated object.
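Several of the claims above recite an agent architecture that couples an agent manager, a system agent, a plurality of domain agents, and an agent library of shared utilities, and then uses them to retrieve an annotated object from a subsequent utterance. The following is a minimal sketch of one way such components might fit together. All class names, the domain-keyed routing, and the keyword-overlap matching are assumptions made for illustration. The claims do not prescribe any of these details.

```python
from __future__ import annotations

from dataclasses import dataclass


class AgentLibrary:
    """Shared utilities usable by the system agent and the domain agents."""

    # Hypothetical stopword list, used only by this sketch.
    STOPWORDS = {"a", "an", "the", "at", "with", "me", "show", "my"}

    def extract_attributes(self, text: str) -> set[str]:
        words = {w.strip(".,!?").lower() for w in text.split()}
        return words - self.STOPWORDS - {""}


@dataclass
class AnnotatedObject:
    object_id: str
    annotation: str
    metadata: set[str]  # metadata stored with the textual annotation


class DomainAgent:
    """Handles utterances for one domain by searching the stored objects."""

    def __init__(self, domain: str, library: AgentLibrary, store: list[AnnotatedObject]):
        self.domain = domain
        self.library = library
        self.store = store

    def handle(self, utterance: str) -> list[AnnotatedObject]:
        attrs = self.library.extract_attributes(utterance)
        return [obj for obj in self.store if attrs & obj.metadata]


class SystemAgent:
    """Fallback for utterances that no domain agent claims."""

    def handle(self, utterance: str) -> list[AnnotatedObject]:
        return []


class AgentManager:
    """Couples the agents and routes each utterance (assumed: by domain name)."""

    def __init__(self, system_agent: SystemAgent):
        self.system_agent = system_agent
        self.domain_agents: dict[str, DomainAgent] = {}

    def register(self, agent: DomainAgent) -> None:
        self.domain_agents[agent.domain] = agent

    def dispatch(self, domain: str, utterance: str) -> list[AnnotatedObject]:
        agent = self.domain_agents.get(domain, self.system_agent)
        return agent.handle(utterance)
```

In this sketch, an object annotated earlier as "Birthday dinner with Anna" would be returned by `dispatch("photos", "show me the birthday pictures")` once a `photos` domain agent is registered, because the keyword `birthday` appears in both the stored metadata and the later utterance.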
Specification