System and method for initiating multi-modal speech recognition using a long-touch gesture
Abstract
A system, method and computer-readable storage devices are disclosed for multi-modal interactions with a system via a long-touch gesture on a touch-sensitive display. A system operating per this disclosure can receive a multi-modal input comprising speech and a touch on a display, wherein the speech comprises a pronoun. When the touch on the display has a duration longer than a threshold duration, the system can identify an object within a threshold distance of the touch, associate the object with the pronoun in the speech, to yield an association, and perform an action based on the speech and the association.
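The flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the threshold values, the pronoun list, and the names `ScreenObject`, `Touch`, `find_pronoun`, `nearest_object`, and `resolve` are all assumptions introduced here, since the abstract speaks only of a "threshold duration" and a "threshold distance".

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Sequence, Tuple

# Hypothetical values: the abstract names the thresholds but gives no numbers.
LONG_TOUCH_SECONDS = 0.5
MAX_DISTANCE_PX = 40.0
# Hypothetical pronoun list used only for this illustration.
PRONOUNS = {"this", "that", "it", "here", "there", "these", "those"}


@dataclass(frozen=True)
class ScreenObject:
    name: str
    x: float
    y: float


@dataclass(frozen=True)
class Touch:
    x: float
    y: float
    duration: float  # seconds the finger stayed on the display


def find_pronoun(speech: str) -> Optional[str]:
    """Return the first known pronoun in the recognized speech, if any."""
    for word in speech.lower().split():
        token = word.strip(".,!?")
        if token in PRONOUNS:
            return token
    return None


def nearest_object(touch: Touch, objects: Sequence[ScreenObject]) -> Optional[ScreenObject]:
    """Return the closest object within MAX_DISTANCE_PX of the touch, if any."""
    best, best_dist = None, MAX_DISTANCE_PX
    for obj in objects:
        dist = hypot(obj.x - touch.x, obj.y - touch.y)
        if dist <= best_dist:
            best, best_dist = obj, dist
    return best


def resolve(speech: str, touch: Touch,
            objects: Sequence[ScreenObject]) -> Optional[Tuple[str, ScreenObject]]:
    """Associate the pronoun with the touched object, or return None."""
    if touch.duration <= LONG_TOUCH_SECONDS:
        return None  # not a long touch, so no multi-modal association
    pronoun = find_pronoun(speech)
    obj = nearest_object(touch, objects)
    if pronoun is None or obj is None:
        return None
    return pronoun, obj
```

A caller would then carry out the spoken command against the returned object; for example, `resolve("how do I get here", Touch(105, 98, 0.8), objects)` pairs "here" with whichever object lies nearest the touch point, provided one is within range.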
20 Claims
1. A method comprising:
receiving a multi-modal input comprising speech and a single touch on a display, the single touch being at a single point;
identifying, based at least in part on a pronoun in the speech and based on the single touch, a first object;
identifying, based at least in part on the pronoun in the speech and based on the single touch, a second object; and
performing an action based on the speech and an association of the first object and the second object as identified by the single touch.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
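The claim does not say how one touch at a single point yields two objects. One possible reading, offered purely as an assumption, is that a plural pronoun selects the two objects nearest the touch point. The sketch below illustrates that reading only; `Item`, `objects_near_touch`, `perform_action`, the pronoun set, and the distance threshold are hypothetical names and values, not taken from the patent.

```python
from math import hypot
from typing import NamedTuple, Optional, Sequence, Tuple

MAX_DISTANCE_PX = 60.0  # hypothetical threshold, not specified in the claim
PLURAL_PRONOUNS = {"these", "those", "them"}  # hypothetical pronoun set


class Item(NamedTuple):
    name: str
    x: float
    y: float


def objects_near_touch(touch_x: float, touch_y: float,
                       items: Sequence[Item]) -> Optional[Tuple[Item, Item]]:
    """Return the two in-range items closest to the touch point, nearest first."""
    def distance(item: Item) -> float:
        return hypot(item.x - touch_x, item.y - touch_y)

    candidates = sorted(
        (item for item in items if distance(item) <= MAX_DISTANCE_PX),
        key=distance,
    )
    if len(candidates) < 2:
        return None
    return candidates[0], candidates[1]


def perform_action(speech: str, touch_x: float, touch_y: float,
                   items: Sequence[Item]) -> Optional[dict]:
    """Describe the action to take, or return None if nothing can be resolved."""
    words = [w.strip(".,!?") for w in speech.lower().split()]
    pronoun = next((w for w in words if w in PLURAL_PRONOUNS), None)
    pair = objects_near_touch(touch_x, touch_y, items)
    if pronoun is None or pair is None:
        return None
    first, second = pair
    # The association of both objects with the pronoun drives the action.
    return {"speech": speech, "pronoun": pronoun, "objects": (first.name, second.name)}
```

Here the returned dictionary stands in for whatever the system would actually do with the command and the two associated objects.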
10. A system comprising:
a processor; and
a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
receiving a multi-modal input comprising speech and a single touch on a display, the single touch being at a single point;
identifying, based at least in part on a pronoun in the speech and based on the single touch, a first object;
identifying, based at least in part on the pronoun in the speech and based on the single touch, a second object; and
performing an action based on the speech and an association of the first object and the second object as identified by the single touch.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
receiving a multi-modal input comprising speech and a single touch on a display, the single touch being at a single point;
identifying, based at least in part on a pronoun in the speech and based on the single touch, a first object;
identifying, based at least in part on the pronoun in the speech and based on the single touch, a second object; and
performing an action based on the speech and an association of the first object and the second object as identified by the single touch.
Dependent claims: 20
Specification