Personalized gesture recognition for user interaction with assistant systems
Abstract
In one embodiment, a method includes accessing a plurality of input tuples associated with a first user from a data store, wherein each input tuple comprises a gesture-input and a corresponding speech-input, determining a plurality of intents corresponding to the plurality of speech-inputs, respectively, by a natural-language understanding (NLU) module, generating a plurality of feature representations for the plurality of gesture-inputs based on one or more machine-learning models, determining a plurality of gesture identifiers for the plurality of gesture-inputs, respectively, based on their respective feature representations, associating the plurality of intents with the plurality of gesture identifiers, respectively, and training a personalized gesture-classification model for the first user based on the plurality of feature representations of their respective gesture-inputs and the associations between the plurality of intents and their respective gesture identifiers.
20 Claims
1. A method comprising, by one or more computing systems:
accessing, from a data store, a plurality of input tuples associated with a first user, wherein each input tuple comprises a gesture-input and a corresponding speech-input;
determining, by a natural-language understanding (NLU) module, a plurality of intents corresponding to the plurality of speech-inputs, respectively;
generating, for the plurality of gesture-inputs, a plurality of feature representations based on one or more machine-learning models;
determining a plurality of gesture identifiers for the plurality of gesture-inputs, respectively, based on their respective feature representations;
associating the plurality of intents with the plurality of gesture identifiers, respectively; and
training, for the first user, a personalized gesture-classification model based on the plurality of feature representations of their respective gesture-inputs and the associations between the plurality of intents and their respective gesture identifiers.
- Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
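The claimed method can be read as a simple training pipeline: pair each gesture with the intent the NLU module extracts from its accompanying speech, featurize the gesture, map features to a gesture identifier, and learn a per-user mapping from identifiers to intents. The following is a minimal illustrative sketch only; every function here (`nlu_intent`, `extract_features`, `gesture_id`) is a hypothetical stand-in, not an implementation disclosed by the patent, and real systems would use learned models in place of these heuristics.

```python
# Hypothetical sketch of the claimed pipeline. All names and heuristics
# below are illustrative assumptions, not taken from the specification.
from collections import defaultdict


def nlu_intent(speech: str) -> str:
    # Stand-in for the NLU module: keyword lookup instead of a real model.
    keywords = {"play": "PLAY_MUSIC", "call": "CALL_CONTACT", "stop": "STOP"}
    for word, intent in keywords.items():
        if word in speech.lower():
            return intent
    return "UNKNOWN"


def extract_features(gesture: list) -> tuple:
    # Stand-in feature representation: summary statistics of the raw
    # gesture signal in place of a learned embedding model.
    mean = sum(gesture) / len(gesture)
    return (round(mean, 3), round(max(gesture) - min(gesture), 3))


def gesture_id(features: tuple) -> str:
    # Stand-in gesture identifier: coarse bucketing of the feature vector.
    return f"g:{int(features[0] * 10)}:{int(features[1] * 10)}"


def train_personalized_model(input_tuples):
    """Associate each gesture identifier with the intent of its paired
    speech-input; the resulting mapping acts as the per-user classifier."""
    votes = defaultdict(lambda: defaultdict(int))
    for gesture, speech in input_tuples:
        intent = nlu_intent(speech)
        gid = gesture_id(extract_features(gesture))
        votes[gid][intent] += 1
    # Collapse vote counts to the majority intent per gesture identifier.
    return {gid: max(counts, key=counts.get) for gid, counts in votes.items()}


# Example (gesture-input, speech-input) tuples for one user.
tuples = [
    ([0.1, 0.9, 0.5], "play some jazz"),
    ([0.0, 0.0, 0.05], "stop the music"),
]
model = train_personalized_model(tuples)
```

The key design point the claim captures is that the speech-input supervises the gesture-input: no manual gesture labels are needed, because the NLU intent of the co-occurring utterance serves as the training label for the gesture's feature representation.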
13. One or more computer-readable non-transitory storage media embodying software that is operable when executed to:
access, from a data store, a plurality of input tuples associated with a first user, wherein each input tuple comprises a gesture-input and a corresponding speech-input;
determine, by a natural-language understanding (NLU) module, a plurality of intents corresponding to the plurality of speech-inputs, respectively;
generate, for the plurality of gesture-inputs, a plurality of feature representations based on one or more machine-learning models;
determine a plurality of gesture identifiers for the plurality of gesture-inputs, respectively, based on their respective feature representations;
associate the plurality of intents with the plurality of gesture identifiers, respectively; and
train, for the first user, a personalized gesture-classification model based on the plurality of feature representations of their respective gesture-inputs and the associations between the plurality of intents and their respective gesture identifiers.
- Dependent claims: 14, 15, 16, 17, 18, 19
20. A system comprising:
one or more processors; and
a non-transitory memory coupled to the processors comprising instructions executable by the processors, the processors operable when executing the instructions to:
access, from a data store, a plurality of input tuples associated with a first user, wherein each input tuple comprises a gesture-input and a corresponding speech-input;
determine, by a natural-language understanding (NLU) module, a plurality of intents corresponding to the plurality of speech-inputs, respectively;
generate, for the plurality of gesture-inputs, a plurality of feature representations based on one or more machine-learning models;
determine a plurality of gesture identifiers for the plurality of gesture-inputs, respectively, based on their respective feature representations;
associate the plurality of intents with the plurality of gesture identifiers, respectively; and
train, for the first user, a personalized gesture-classification model based on the plurality of feature representations of their respective gesture-inputs and the associations between the plurality of intents and their respective gesture identifiers.
Specification