INTUITIVE COMPUTING METHODS AND SYSTEMS
1 Assignment
0 Petitions
Abstract
A smart phone senses audio, imagery, and/or other stimulus from a user's environment, and acts autonomously to fulfill inferred or anticipated user desires. In one aspect, the detailed technology concerns phone-based cognition of a scene viewed by the phone's camera. The image processing tasks applied to the scene can be selected from among various alternatives by reference to resource costs, resource constraints, other stimulus information (e.g., audio), task substitutability, etc. The phone can apply more or fewer resources to an image processing task depending on how successfully the task is proceeding, or based on the user's apparent interest in the task. In some arrangements, the phone is guided in various of its intuitive computing operations by user-spoken clues. A discovery session may be launched by the user speaking a cueing expression, which serves to switch the device from a lower activity state to a heightened alert state. Cognition, and identification of appropriate device response(s), can be aided by collateral information, such as context. A great number of other features and arrangements are also detailed.
56 Citations
93 Claims
-
1-82. (canceled)
-
83. A method employing a device equipped with a processor, a display, a camera and a microphone, the camera capturing imagery depicting plural items in a user's physical environment, the method comprising the acts:
-
capturing first speech of the user, with the device microphone;

the device processor detecting that the captured first speech includes a cueing expression, and in response to detection of the cueing expression, the device switching from a lower activity state to a heightened alert state, in the heightened alert state the device performing functionality including:

capturing second user speech with the device microphone;

sending data corresponding to the second user speech to a recognition module, and receiving recognized second speech data in return, the recognized second user speech indicating one of said plural items depicted in the captured imagery as of particular user interest;

based on one or more descriptors included in the recognized second speech data, determining a first of said plural depicted items as being of likely user interest;

presenting a marking on the device display, at a location indicating said first item;

capturing third user speech with the device microphone, the captured third user speech being different than the second user speech;

sending data corresponding to the third user speech to the recognition module, and receiving recognized third speech data in return, the recognized third speech data again indicating one of said plural items as of particular user interest;

based on one or more descriptors included in the recognized third speech data, determining that a second, different one of said plural depicted items is of greater interest to the user than the first item;

moving said marking on the device display to a location indicating said second item; and

taking an action based on the second item, said action including presenting information related to the second item to the user;

wherein the device is not on heightened alert all the time, but is cued into activation from a lower activity state by the cueing expression, thereby bounding the device's processing efforts, and the descriptors in the recognized second and third speech data iteratively guide the device in identifying which of the plural items in the user's physical environment is of user interest, thereby further bounding the device's processing efforts.

Dependent claims: 84, 85, 86, 87, 88
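The flow recited in claim 83 — a cueing expression that gates a low-activity state, after which spoken descriptors iteratively retarget an on-screen marking among depicted items — can be sketched as a small state machine. Everything below (the cue phrase, the item descriptors, the keyword matching standing in for the recognition module) is hypothetical illustration, not language from the patent.

```python
# Minimal sketch of the claim-83 flow, under assumed names: a cueing
# expression gates the low-activity state, then spoken descriptors
# iteratively retarget an on-screen marking among depicted items.
CUE = "hey device"  # hypothetical cueing expression


class DiscoverySession:
    def __init__(self, items):
        self.items = items            # item descriptor -> (x, y) display location
        self.state = "low_activity"   # device starts in the lower activity state
        self.marked_item = None       # which item the on-screen marking indicates

    def hear(self, speech):
        """Process one utterance; return the marking's new location, if any."""
        speech = speech.lower()
        if self.state == "low_activity":
            # In the low state only the cueing expression is checked,
            # bounding the device's processing efforts.
            if CUE in speech:
                self.state = "heightened_alert"
            return None
        # Heightened alert: match spoken descriptors against depicted items
        # (a keyword stand-in for the claim's recognition module).
        for descriptor, location in self.items.items():
            if descriptor in speech:
                self.marked_item = descriptor   # move the marking to this item
                return location
        return None


session = DiscoverySession({"mug": (120, 80), "lamp": (300, 40)})
session.hear("hey device")                   # first speech: the cueing expression
loc1 = session.hear("what is that mug?")     # second speech: marks the mug
loc2 = session.hear("no, I meant the lamp")  # third speech: retargets to the lamp
```

Before the cue, `hear` ignores everything except the cueing expression; after it, each successive utterance can move the marking, mirroring the claim's iterative narrowing of the device's processing effort.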
-
-
89. A method employing a device equipped with a processor, a camera and a microphone, the camera capturing imagery depicting plural items in a user's physical environment, the method comprising the acts:
-
capturing first speech of the user, with the device microphone;

detecting, with said device processor, that the captured first speech includes a cueing expression;

in response to detection of the cueing expression, switching the device from a lower activity state to a heightened alert state, in the heightened alert state the device performing functionality including:

capturing second speech of the user;

sending data from the device, said data including data corresponding to the second user speech and data corresponding to the captured imagery, and receiving data, including (a) recognized second speech data and (b) recognition-processed data about a subject depicted in the imagery, in return; and

taking an action based on said received data, including presenting information based on the recognition-processed data to the user;

wherein the device is not on heightened alert all the time, but is cued into activation from a lower activity state by the cueing expression, thereby bounding the device's processing efforts, and wherein, in its heightened alert state, the device cooperates with a remote computer system to recognition-process imagery captured by the device camera.

Dependent claims: 90, 91, 92
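Claim 89 splits the work between the device and a remote computer system: in the heightened alert state, the device ships both speech data and captured imagery out, and presents what comes back. The exchange can be sketched as below; the function names and the canned results are invented for illustration and are not from the patent.

```python
# Sketch of the claim-89 exchange, with invented names: in the heightened
# alert state the device sends speech data and captured imagery to a
# remote computer system, then presents the recognition results returned.
def remote_recognize(speech_data, image_data):
    """Stand-in for the remote computer system. Returns canned results to
    show the shape of the exchange, not real recognition processing."""
    return {
        "recognized_speech": speech_data.decode("utf-8"),
        "image_subject": "coffee mug",  # recognition-processed imagery result
    }


def heightened_alert_cycle(speech_data, image_data):
    """Device side: send speech + imagery out, then present information
    based on the recognition-processed data received in return."""
    result = remote_recognize(speech_data, image_data)
    return f"{result['image_subject']}: {result['recognized_speech']}"


message = heightened_alert_cycle(b"what is this?", b"<jpeg bytes>")
```

The design point the claim emphasizes is that this round trip only happens after the cueing expression has moved the device into its heightened alert state, so the (comparatively expensive) remote cooperation is bounded.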
-
-
93. A tangible computer readable medium containing instructions to configure a device, equipped with a display, a camera and a microphone, the camera capturing imagery depicting plural items in a user's physical environment, to perform acts including:
-
capturing first speech of the user;

detecting that the captured first speech includes a cueing expression, and in response to detection of the cueing expression, switching the device from a lower activity state to a heightened alert state, in the heightened alert state the instructions configuring the device to perform functions including:

capturing second user speech;

sending data corresponding to the second user speech to a recognition module, and receiving recognized second speech data in return;

based on one or more descriptors included in the recognized second speech data, determining a first of said plural depicted items as being of likely user interest;

presenting a marking on the display, at a location indicating said first item;

capturing third user speech, the captured third user speech being different than the second user speech;

sending data corresponding to the third user speech to the recognition module, and receiving recognized third speech data in return;

based on one or more descriptors included in the recognized third speech data, determining that a second, different one of said plural depicted items is of greater interest to the user than the first item;

moving said marking on the display to a location indicating said second item; and

taking an action based on the second item, said action including presenting information related to the second item to the user;

wherein the device is not on heightened alert all the time, but is cued by said instructions into activation from a lower activity state by the cueing expression, thereby bounding the device's processing efforts, and the instructions enable descriptors in the recognized second and third speech data to iteratively guide the device in identifying which of the plural items in the user's physical environment is of user interest, thereby further bounding the device's processing efforts.
-
Specification