System and method for multimodal human-vehicle interaction and belief tracking
First Claim
1. A method for multimodal human-vehicle interaction, comprising:
receiving input from an occupant in a vehicle via more than one mode, wherein the input includes a speech input and a gesture input;
performing multimodal recognition of the input to determine a reference to a point of interest based on the speech input and to extract a visual point of interest based on the gesture input and the reference to the point of interest in the speech input;
augmenting at least one recognition hypothesis based on the visual point of interest;
determining a belief state of the occupant's intent, wherein the belief state is determined based on joint probability distribution tables of probabilistic ontology trees and the probabilistic ontology trees are based on the recognition hypothesis; and
selecting an action to take based on the determined belief state.
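The augmenting step of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the `Hypothesis` class, the `augment_hypotheses` function, the category names, the 0.05 floor, and all scores are hypothetical, and the multiplicative rescoring is one simple way to let a gesture-derived point of interest reweight an N-best list of speech hypotheses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    text: str       # recognized utterance
    poi_type: str   # category of the point of interest the utterance refers to
    score: float    # recognizer confidence in [0, 1]


def augment_hypotheses(hypotheses, visual_pois, floor=0.05):
    """Rescore speech hypotheses with gesture-derived POI evidence.

    visual_pois maps a POI category to the (assumed) likelihood that the
    occupant pointed at a POI of that category. Categories with no visual
    support receive a small floor so they are not eliminated outright.
    """
    rescored = [
        Hypothesis(h.text, h.poi_type, h.score * visual_pois.get(h.poi_type, floor))
        for h in hypotheses
    ]
    total = sum(h.score for h in rescored)
    if total == 0:
        return list(hypotheses)  # no usable evidence: leave the list unchanged
    normalized = [Hypothesis(h.text, h.poi_type, h.score / total) for h in rescored]
    return sorted(normalized, key=lambda h: h.score, reverse=True)


# Hypothetical N-best list: the recognizer slightly prefers the wrong reading.
n_best = [
    Hypothesis("what is that rest stop", "rest_stop", 0.55),
    Hypothesis("what is that restaurant", "restaurant", 0.45),
]
# Hypothetical POI categories found along the pointing direction.
visual = {"restaurant": 0.8, "gas_station": 0.2}

augmented = augment_hypotheses(n_best, visual)
```

In this example the gesture evidence promotes the "restaurant" reading to the top of the list even though the speech recognizer alone ranked it second.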
Abstract
A method and system for multimodal human-vehicle interaction including receiving input from an occupant in a vehicle via more than one mode and performing multimodal recognition of the input. The method also includes augmenting at least one recognition hypothesis based on at least one visual point of interest and determining a belief state of the occupant's intent based on the recognition hypothesis. The method further includes selecting an action to take based on the determined belief state.
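The belief-state and action-selection steps can be sketched as follows. This is an illustration under stated assumptions, not the method of the specification: a probabilistic ontology tree is reduced here to a two-node tree (intent as root, point-of-interest category as child), and the intents, probabilities, the 0.6 threshold, and the names `joint_table`, `belief_state`, and `select_action` are all hypothetical.

```python
# Hypothetical two-node tree: intent (root) -> POI category (child).
PRIOR = {"query_rating": 0.5, "navigate_to": 0.3, "query_price": 0.2}
CPT = {  # P(poi_type | intent), each row sums to 1
    "query_rating": {"restaurant": 0.7, "gas_station": 0.1, "rest_stop": 0.2},
    "navigate_to": {"restaurant": 0.3, "gas_station": 0.4, "rest_stop": 0.3},
    "query_price": {"restaurant": 0.2, "gas_station": 0.7, "rest_stop": 0.1},
}


def joint_table(prior, cpt):
    """Build the joint distribution table P(intent, poi_type)."""
    return {
        (intent, poi_type): prior[intent] * p
        for intent, row in cpt.items()
        for poi_type, p in row.items()
    }


def belief_state(joint, evidence):
    """Posterior over intents given soft evidence on the POI category.

    evidence maps a POI category to a recognition-hypothesis score.
    """
    unnormalized = {}
    for (intent, poi_type), p in joint.items():
        weight = evidence.get(poi_type, 0.0)
        unnormalized[intent] = unnormalized.get(intent, 0.0) + p * weight
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence is incompatible with every intent")
    return {intent: v / total for intent, v in unnormalized.items()}


def select_action(belief, threshold=0.6):
    """Act on the most likely intent if confident, otherwise ask to confirm."""
    intent, prob = max(belief.items(), key=lambda kv: kv[1])
    return ("execute", intent) if prob >= threshold else ("confirm", intent)


# Hypothetical scores taken from an augmented recognition hypothesis list.
evidence = {"restaurant": 0.93, "rest_stop": 0.07}
joint = joint_table(PRIOR, CPT)
belief = belief_state(joint, evidence)
action = select_action(belief)
```

With these numbers the belief in `query_rating` is about 0.72, which clears the assumed threshold, so the sketch selects execution over a confirmation prompt. A lower-confidence belief would instead yield a `"confirm"` action.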
12 Claims
1. A method for multimodal human-vehicle interaction, comprising:
receiving input from an occupant in a vehicle via more than one mode, wherein the input includes a speech input and a gesture input;
performing multimodal recognition of the input to determine a reference to a point of interest based on the speech input and to extract a visual point of interest based on the gesture input and the reference to the point of interest in the speech input;
augmenting at least one recognition hypothesis based on the visual point of interest;
determining a belief state of the occupant's intent, wherein the belief state is determined based on joint probability distribution tables of probabilistic ontology trees and the probabilistic ontology trees are based on the recognition hypothesis; and
selecting an action to take based on the determined belief state.
Dependent claims: 2, 3, 4, 5
6. A method for multimodal human-vehicle interaction, comprising:
receiving an input from an occupant of a vehicle including a first input and a second input, wherein the first and second inputs represent different modalities;
performing multimodal recognition of the first and second inputs to determine a reference of a point of interest based on the first input and to extract a visual point of interest based on the second input and the reference to the point of interest in the first input;
modifying a recognition hypothesis of the first input with the second input;
determining a belief state of the occupant's intent, wherein the belief state is determined based on joint probability distribution tables of probabilistic ontology trees and the probabilistic ontology trees are based on the recognition hypothesis; and
selecting an action to take based on the determined belief state.
Dependent claims: 7, 8, 9
10. A system for multimodal human-vehicle interaction, comprising:
a plurality of sensors for sensing interaction data from a vehicle occupant, wherein the interaction data includes a speech input and a gesture input;
a multimodal recognition module for performing multimodal recognition of the interaction data;
a point of interest identification module for determining a reference to a point of interest based on the speech input and to extract a visual point of interest based on the gesture input and the reference to the point of interest in the speech input, wherein the multimodal recognition module augments a recognition hypothesis based on the visual point of interest;
a belief tracking module for determining a belief state of the occupant's intent, wherein the belief state is determined based on joint probability distribution tables of probabilistic ontology trees and the probabilistic ontology trees are based on the recognition hypothesis; and
a dialog management and action module for selecting an action to take based on the determined belief state.
Dependent claims: 11, 12
Specification