System and method for an integrated, multi-modal, multi-device natural language voice services environment
First Claim
1. A method to provide an integrated, multi-modal, natural language voice services environment having an input device, a central device, and one or more secondary devices, wherein the method comprises:
receiving, at the central device, a multi-modal natural language input from the input device, wherein the input device initially received the multi-modal natural language input;
maintaining, on the input device, the central device, and the one or more secondary devices, a constellation model that describes natural language resources, dynamic states, and intent determination capabilities associated with the input device, the central device, and the one or more secondary devices;
aggregating the natural language resources, the dynamic states, and the intent determination capabilities associated with the input device and the one or more secondary devices on the central device to converge the natural language resources, the dynamic states, and the intent determination capabilities held across the natural language voice services environment on the central device;
determining, on the central device, a preliminary intent associated with the multi-modal natural language input using the converged natural language resources, dynamic states, and intent determination capabilities held across the natural language voice services environment;
sending the multi-modal natural language input from the central device to the one or more secondary devices to invoke the intent determination capabilities associated with the one or more secondary devices;
collating, at the central device, intent determination responses received from the one or more secondary devices with the preliminary intent determined on the central device to generate an intent hypothesis associated with the multi-modal natural language input on the central device; and
returning the intent hypothesis associated with the multi-modal natural language input and information relating to one or more requests associated with the multi-modal natural language input to the input device, wherein the input device invokes one or more actions based on the returned intent hypothesis and the information relating to one or more requests associated with the multi-modal natural language input.
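The collating step of claim 1 can be illustrated with a minimal sketch. The claim does not specify how responses are combined, so the scoring rule here (summing confidence per intent and taking the highest total) is an assumption, as are the names `IntentResponse`, `collate`, and the example device and intent labels.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class IntentResponse:
    """One device's guess at the intent of an input (hypothetical structure)."""
    device: str
    intent: str
    confidence: float  # assumed to lie in [0, 1]


def collate(preliminary, responses):
    """Combine the central device's preliminary intent with secondary responses.

    Assumption: confidences for the same intent are summed and the highest
    total wins; the claim itself does not say how collation is performed.
    """
    totals = defaultdict(float)
    for r in [preliminary, *responses]:
        totals[r.intent] += r.confidence
    # Iterate in sorted order so ties resolve deterministically (alphabetically).
    best = max(sorted(totals), key=lambda intent: totals[intent])
    return best, dict(totals)


# Central device's own preliminary intent plus two secondary-device responses.
preliminary = IntentResponse("central", "play_music", 0.5)
responses = [
    IntentResponse("phone", "call_contact", 0.4),
    IntentResponse("media_player", "play_music", 0.9),
]
hypothesis, scores = collate(preliminary, responses)
```

In this sketch the returned `hypothesis` would be what the central device sends back to the input device; a real system could weight devices differently or keep several ranked hypotheses.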
10 Assignments
0 Petitions
Abstract
A system and method for an integrated, multi-modal, multi-device natural language voice services environment may be provided. In particular, the environment may include a plurality of voice-enabled devices each having intent determination capabilities for processing multi-modal natural language inputs in addition to knowledge of the intent determination capabilities of other devices in the environment. Further, the environment may be arranged in a centralized manner, a distributed peer-to-peer manner, or various combinations thereof. As such, the various devices may cooperate to determine intent of multi-modal natural language inputs, and commands, queries, or other requests may be routed to one or more of the devices best suited to take action in response thereto.
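As a rough illustration of the routing idea in the abstract, the sketch below picks the devices that could act on a determined intent. It assumes "best suited" reduces to a capability match plus an availability flag, which is a simplification; the `constellation` dictionary layout, the `route` function, and all device and intent names are hypothetical.

```python
# Hypothetical view of what each device in the environment knows about the others.
constellation = {
    "phone": {"capabilities": {"call_contact", "play_music"}, "available": True},
    "navigation": {"capabilities": {"get_directions"}, "available": True},
    "media_player": {"capabilities": {"play_music"}, "available": False},
}


def route(intent, model):
    """Return the names of available devices able to handle the intent."""
    return sorted(
        name
        for name, info in model.items()
        if info["available"] and intent in info["capabilities"]
    )


music_targets = route("play_music", constellation)
directions_targets = route("get_directions", constellation)
unknown_targets = route("send_email", constellation)
```

An empty result, as for `send_email` here, would be the point at which a fuller system might ask the user to clarify or fall back to another device.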
682 Citations
24 Claims
1. A method to provide an integrated, multi-modal, natural language voice services environment having an input device, a central device, and one or more secondary devices, wherein the method comprises:
receiving, at the central device, a multi-modal natural language input from the input device, wherein the input device initially received the multi-modal natural language input;
maintaining, on the input device, the central device, and the one or more secondary devices, a constellation model that describes natural language resources, dynamic states, and intent determination capabilities associated with the input device, the central device, and the one or more secondary devices;
aggregating the natural language resources, the dynamic states, and the intent determination capabilities associated with the input device and the one or more secondary devices on the central device to converge the natural language resources, the dynamic states, and the intent determination capabilities held across the natural language voice services environment on the central device;
determining, on the central device, a preliminary intent associated with the multi-modal natural language input using the converged natural language resources, dynamic states, and intent determination capabilities held across the natural language voice services environment;
sending the multi-modal natural language input from the central device to the one or more secondary devices to invoke the intent determination capabilities associated with the one or more secondary devices;
collating, at the central device, intent determination responses received from the one or more secondary devices with the preliminary intent determined on the central device to generate an intent hypothesis associated with the multi-modal natural language input on the central device; and
returning the intent hypothesis associated with the multi-modal natural language input and information relating to one or more requests associated with the multi-modal natural language input to the input device, wherein the input device invokes one or more actions based on the returned intent hypothesis and the information relating to one or more requests associated with the multi-modal natural language input.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A system to provide an integrated, multi-modal, natural language voice services environment having an input device, one or more secondary devices, and a central device configured to:
receive a multi-modal natural language input from the input device, wherein the input device initially received the multi-modal natural language input;
maintain a constellation model and distribute the constellation model to the input device and the one or more secondary devices, wherein the constellation model describes natural language resources, dynamic states, and intent determination capabilities associated with the input device, the central device, and the one or more secondary devices;
aggregate the natural language resources, the dynamic states, and the intent determination capabilities associated with the input device and the one or more secondary devices to converge the natural language resources, the dynamic states, and the intent determination capabilities held across the natural language voice services environment;
use the converged natural language resources, dynamic states, and intent determination capabilities held across the natural language voice services environment to determine a preliminary intent associated with the multi-modal natural language input;
send the multi-modal natural language input to the one or more secondary devices to invoke the intent determination capabilities associated with the one or more secondary devices;
collate intent determination responses received from the one or more secondary devices with the determined preliminary intent to generate an intent hypothesis associated with the multi-modal natural language input on the central device; and
return the intent hypothesis associated with the multi-modal natural language input and information relating to one or more requests associated with the multi-modal natural language input to the input device, wherein the input device is configured to invoke one or more actions based on the returned intent hypothesis and the information relating to one or more requests associated with the multi-modal natural language input.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24
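The maintain, aggregate, and distribute elements of claim 13 can be sketched as follows. This is only an illustration under stated assumptions: the `Device` class, the `aggregate` and `distribute` helpers, and the resource and state labels are all hypothetical, and the claim does not prescribe any particular data layout or transport.

```python
import copy


class Device:
    """A voice-enabled device with a local description and a model copy."""

    def __init__(self, name, resources, state):
        self.name = name
        # Local natural language resources and dynamic state (assumed fields).
        self.local = {"resources": set(resources), "state": state}
        self.constellation = {}  # this device's copy of the shared model


def aggregate(devices):
    """Converge every device's local description into one model."""
    return {d.name: copy.deepcopy(d.local) for d in devices}


def distribute(model, devices):
    """Give each device its own independent copy of the converged model."""
    for d in devices:
        d.constellation = copy.deepcopy(model)


central = Device("central", {"general_grammar"}, "idle")
phone = Device("phone", {"contacts_grammar"}, "in_call")
nav = Device("navigation", {"poi_grammar"}, "routing")
devices = [central, phone, nav]

model = aggregate(devices)   # performed on the central device in this sketch
distribute(model, devices)   # every device now describes every other device
```

Deep copies are used so that a later change on one device does not silently alter another device's view; a real deployment would need some update protocol, which is outside what this sketch covers.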
Specification