
Eye gaze for spoken language understanding in multi-modal conversational interactions

  • US 10,317,992 B2
  • Filed: 09/25/2014
  • Issued: 06/11/2019
  • Est. Priority Date: 09/25/2014
  • Status: Active Grant
First Claim

1. A computer-implemented method comprising:

  • identifying a plurality of visual elements available for user interaction in a visual context on a display;

  • receiving speech input including one or more words spoken by a user;

  • extracting lexical features from the speech input;

  • computing, for each visual element of the plurality of visual elements, a lexical similarity between the lexical features and the respective visual element of the plurality of visual elements and a lexical probability for each lexical similarity;

  • receiving, from a tracking component, a gaze input;

  • determining, from the gaze input, a heat map representing a probabilistic model of objects the user is looking at in the visual context on the display, the objects including the plurality of visual elements;

  • determining that a particular visual element of the plurality of visual elements is an intended visual element of the speech input using a combination of a lexical probability of the lexical probabilities and the heat map;

  • determining, by one or more processors, that the speech input comprises a command directed to the particular visual element; and

  • causing an action associated with the particular visual element to be performed.
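The claim above combines two probability distributions over on-screen elements: one derived from lexical similarity between the spoken words and each element, and one derived from a gaze heat map. The sketch below is a minimal, hypothetical illustration of that combination; the class and function names (VisualElement, lexical_probabilities, gaze_heat_map, resolve_target), the token-overlap similarity, the Gaussian heat map, and the product fusion are all assumptions for illustration, not the patent's actual implementation.

```python
# Hypothetical sketch only: names and models here are illustrative,
# not taken from the patent or any particular implementation.
import math
from dataclasses import dataclass


@dataclass
class VisualElement:
    label: str    # on-screen text associated with the element
    x: float      # centre of the element on the display (pixels)
    y: float
    action: str   # action to perform if this element is the target


def lexical_probabilities(words, elements):
    """Token-overlap similarity between the utterance and each element's
    label, normalised into a probability distribution over elements."""
    spoken = set(w.lower() for w in words)
    sims = []
    for el in elements:
        label_tokens = set(el.label.lower().split())
        sims.append(len(spoken & label_tokens) / max(len(label_tokens), 1))
    total = sum(sims)
    if total == 0:
        # No lexical evidence: fall back to a uniform distribution.
        return [1.0 / len(elements)] * len(elements)
    return [s / total for s in sims]


def gaze_heat_map(fixations, elements, sigma=80.0):
    """Probabilistic heat map: each gaze fixation contributes a Gaussian
    centred at its screen coordinates; the map is evaluated at each
    element's centre and normalised over the elements."""
    weights = []
    for el in elements:
        w = 0.0
        for gx, gy in fixations:
            d2 = (el.x - gx) ** 2 + (el.y - gy) ** 2
            w += math.exp(-d2 / (2.0 * sigma ** 2))
        weights.append(w)
    total = sum(weights) or 1.0
    return [w / total for w in weights]


def resolve_target(words, fixations, elements):
    """Combine the lexical probabilities with the gaze heat map
    (simple product fusion here) and return the most likely
    intended visual element."""
    lex = lexical_probabilities(words, elements)
    gaze = gaze_heat_map(fixations, elements)
    scores = [l * g for l, g in zip(lex, gaze)]
    return elements[scores.index(max(scores))]


if __name__ == "__main__":
    elements = [
        VisualElement("The Matrix", x=200, y=150, action="play_matrix"),
        VisualElement("Inception", x=600, y=150, action="play_inception"),
    ]
    # "Play that one" carries no label match, so the lexical distribution
    # is uniform and the gaze heat map disambiguates the target.
    target = resolve_target(["play", "that", "one"],
                            [(590, 160), (610, 145)], elements)
    print(target.label, "->", target.action)
```

Product fusion is only one way to combine the two signals; the claim itself requires only that a lexical probability and the heat map be used in combination before the command directed to the chosen element is executed.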
