Semantic interpretation using user gaze order
First Claim
1. A computer-implemented method for semantically interpreting speech recognition results using gaze tracking data, the method comprising:
obtaining data identifying a sequence of gaze attention dwell positions and a gaze order associated with each gaze attention dwell position, wherein a gaze attention dwell position corresponds to a position on a user interface in which a duration of attention on a single element satisfies a minimum dwell time threshold;
after obtaining the data identifying the sequence of gaze attention dwell positions and the gaze order associated with each gaze attention dwell position, receiving audio data encoding an utterance;
obtaining a transcription of the utterance;
correlating the gaze attention dwell positions and the gaze order associated with each gaze attention dwell position with a semantic description of elements displayed on a visual display to identify one or more particular elements and a respective gaze order associated with each particular element;
performing semantic interpretation of at least one term included in the transcription based at least on the gaze order associated with the particular elements; and
outputting a result of performing the semantic interpretation of the at least one term.
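The dwell-time condition in the first step of the claim can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `Sample` tuple layout, the `Dwell` record, the `extract_dwells` function, the 200 ms default threshold, and the premise that each raw gaze sample has already been hit-tested against the user interface. The sketch groups consecutive samples that rest on the same target, keeps only the groups whose duration meets the minimum dwell time, and numbers the survivors in gaze order.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical sample layout: (timestamp_ms, x, y, hit-tested target or None).
Sample = Tuple[int, float, float, Optional[str]]


@dataclass
class Dwell:
    x: float           # mean x of the samples in the dwell
    y: float           # mean y of the samples in the dwell
    duration_ms: int   # time from first to last sample on the target
    order: int         # 1-based gaze order within the sequence


def extract_dwells(samples: List[Sample], min_dwell_ms: int = 200) -> List[Dwell]:
    """Return dwell positions, in gaze order, from time-sorted gaze samples."""
    # Group consecutive samples that fall on the same target.
    runs: List[Tuple[Optional[str], List[Sample]]] = []
    for sample in samples:
        target = sample[3]
        if runs and runs[-1][0] == target:
            runs[-1][1].append(sample)
        else:
            runs.append((target, [sample]))

    dwells: List[Dwell] = []
    for target, run in runs:
        duration = run[-1][0] - run[0][0]
        # Skip gaze that is off every element or too brief to count as a dwell.
        if target is None or duration < min_dwell_ms:
            continue
        cx = sum(s[1] for s in run) / len(run)
        cy = sum(s[2] for s in run) / len(run)
        dwells.append(Dwell(cx, cy, duration, order=len(dwells) + 1))
    return dwells


samples = [
    (0, 10.0, 10.0, "a"), (100, 12.0, 10.0, "a"), (250, 14.0, 10.0, "a"),
    (300, 50.0, 50.0, "b"), (350, 50.0, 50.0, "b"),   # only 50 ms: dropped
    (500, 90.0, 20.0, "c"), (800, 90.0, 20.0, "c"),
]
dwells = extract_dwells(samples)
```

In this sketch the brief glance at target "b" is discarded, so the two remaining dwells receive gaze orders 1 and 2.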
Abstract
Methods and systems, including computer programs encoded on computer-readable storage media, for performing semantic interpretation using gaze order. The method includes obtaining data identifying a sequence of gaze attention dwell positions, obtaining a semantic description of elements displayed on a visual display, obtaining a transcription of an utterance, correlating the gaze attention dwell positions with the semantic description of elements to generate a sequence of one or more of the elements, performing semantic interpretation of at least one term included in the transcription based at least on the sequence of the elements, and outputting a result of performing the semantic interpretation of the at least one term.
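The correlation and interpretation steps summarized in the abstract might look roughly like the following Python sketch. It is an illustration under stated assumptions, not the patented implementation: the `Element` record, the bounding-box form of the semantic description, the `DEICTIC` word list, and the rule that the i-th deictic term binds to the i-th gazed element are all hypothetical choices. The abstract only requires that interpretation be based at least on the sequence of elements.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Element:
    label: str                               # semantic description of the element
    box: Tuple[float, float, float, float]   # (left, top, right, bottom) on the display


def correlate(dwell_positions: List[Tuple[float, float]],
              elements: List[Element]) -> List[Element]:
    """Map dwell positions (already in gaze order) to the elements they fall on."""
    ordered: List[Element] = []
    for x, y in dwell_positions:
        for el in elements:
            left, top, right, bottom = el.box
            if left <= x <= right and top <= y <= bottom:
                ordered.append(el)
                break  # first matching element wins; dwells off every element are skipped
    return ordered


# Hypothetical set of terms whose meaning depends on what the user looked at.
DEICTIC = {"this", "that", "it", "here", "there"}


def interpret(transcription: str, gazed: List[Element]) -> List[Tuple[str, str]]:
    """Assumed rule: bind the i-th deictic term to the i-th gazed element."""
    bindings: List[Tuple[str, str]] = []
    remaining = iter(gazed)
    for word in transcription.split():
        term = word.lower().strip(".,!?")
        if term in DEICTIC:
            el = next(remaining, None)
            if el is not None:
                bindings.append((term, el.label))
    return bindings


elements = [
    Element("red chair", (0, 0, 100, 100)),
    Element("kitchen", (200, 0, 300, 100)),
]
gazed = correlate([(50, 50), (250, 40)], elements)
result = interpret("Move this there.", gazed)
```

Here the user looks at the chair and then at the kitchen before speaking, so `result` binds "this" to "red chair" and "there" to "kitchen". A different binding rule (for example, most recent dwell first) would fit the same interface.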
251 Citations
20 Claims
1. A computer-implemented method for semantically interpreting speech recognition results using gaze tracking data, the method comprising:
obtaining data identifying a sequence of gaze attention dwell positions and a gaze order associated with each gaze attention dwell position, wherein a gaze attention dwell position corresponds to a position on a user interface in which a duration of attention on a single element satisfies a minimum dwell time threshold;
after obtaining the data identifying the sequence of gaze attention dwell positions and the gaze order associated with each gaze attention dwell position, receiving audio data encoding an utterance;
obtaining a transcription of the utterance;
correlating the gaze attention dwell positions and the gaze order associated with each gaze attention dwell position with a semantic description of elements displayed on a visual display to identify one or more particular elements and a respective gaze order associated with each particular element;
performing semantic interpretation of at least one term included in the transcription based at least on the gaze order associated with the particular elements; and
outputting a result of performing the semantic interpretation of the at least one term.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. A system comprising:
one or more computers; and
a computer readable storage medium storing computer software instructions executable by the one or more computers to perform operations comprising:
obtaining data identifying a sequence of gaze attention dwell positions and a gaze order associated with each gaze attention dwell position;
obtaining a transcription of an utterance;
correlating the gaze attention dwell positions and the gaze order associated with each gaze attention dwell position with a semantic description of elements displayed on a visual display to identify one or more particular elements and a respective gaze order associated with each particular element;
performing semantic interpretation of at least one term included in the transcription based at least on the gaze order associated with particular elements; and
outputting a result of performing the semantic interpretation of the at least one term.
Dependent claims: 17
18. A computer-readable storage device storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
obtaining data identifying a sequence of gaze attention dwell positions and a gaze order associated with each gaze attention dwell position;
obtaining a transcription of an utterance;
correlating the gaze attention dwell positions and the gaze order associated with each gaze attention dwell position with a semantic description of elements displayed on a visual display to identify one or more particular elements and a respective gaze order associated with each particular element;
performing semantic interpretation of at least one term included in the transcription based at least on the gaze order associated with particular elements; and
outputting a result of performing the semantic interpretation of the at least one term.
Dependent claims: 19, 20
Specification