Providing information services related to multimodal inputs
First Claim
1. A method comprising:
receiving visual imagery at a system server from a client device, wherein the system server comprises a text recognition engine and the visual imagery comprises text;
extracting the text from the visual imagery using the text recognition engine;
generating, by the system server, a plurality of contexts based on the text, wherein the plurality of contexts comprises a word from the text;
ranking the plurality of contexts based on relevance to the extracted text;
querying, by the system server, a database using the ranked plurality of contexts as a parameter;
generating, by the system server, a list of search results based on the query using the ranked plurality of contexts, wherein one or more items from the generated list of search results are to be displayed on the client device as an augmented representation of the visual imagery on the client device, the augmented representation to include the visual imagery and a representation of the one or more items from the list integrated with the visual imagery;
displaying the augmented representation of the visual imagery in a first region of the screen; and
displaying the list of search results in a second region of the screen, wherein the first region of the screen has a first area size, the second region of the screen has a second area size, and the first area size is larger than the second area size.
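The server-side steps of claim 1 (generating contexts from the extracted text, ranking them, and querying a database) can be illustrated with a minimal sketch. The claim does not say how contexts are formed or how relevance is measured, so everything here is an assumption made for illustration: contexts are single words and adjacent word pairs, relevance is the number of times a context occurs in the extracted text, and the database is an in-memory list of records. The function names are hypothetical, and the text recognition step is assumed to have already produced a string.

```python
import re
from collections import Counter


def extract_words(text):
    """Lowercase word tokens from text already produced by a text recognition step."""
    return re.findall(r"[a-z0-9]+", text.lower())


def generate_contexts(words):
    """Assumption: a context is a single word or a pair of adjacent words.

    Returns a Counter mapping each context to how often it occurs in the text.
    """
    bigrams = [" ".join(pair) for pair in zip(words, words[1:])]
    return Counter(list(words) + bigrams)


def rank_contexts(context_counts):
    """Assumption: relevance is the occurrence count; ties are broken alphabetically."""
    return sorted(context_counts, key=lambda ctx: (-context_counts[ctx], ctx))


def query_database(database, ranked_contexts, limit=3):
    """Return records whose keywords match the top-ranked contexts, in rank order."""
    results = []
    for ctx in ranked_contexts[:limit]:
        for record in database:
            if ctx in record["keywords"] and record not in results:
                results.append(record)
    return results


if __name__ == "__main__":
    # Hypothetical text and records, used only to exercise the functions.
    ranked = rank_contexts(generate_contexts(extract_words("Pizza Palace pizza menu")))
    records = [
        {"title": "Pizza recipes", "keywords": {"pizza"}},
        {"title": "Palace tours", "keywords": {"palace"}},
    ]
    for record in query_database(records, ranked):
        print(record["title"])
```

Any other context model or relevance measure would fit the claim language equally well; the counting scheme is chosen only because it is easy to follow.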
2 Assignments
0 Petitions
Abstract
A system and method provide information services related to multimodal inputs. Several different types of data used as multimodal inputs are described. Also described are various methods involving the generation of contexts using multimodal inputs, synthesizing context-information service mappings, and identifying and providing information services.
81 Citations
18 Claims
1. A method comprising:
receiving visual imagery at a system server from a client device, wherein the system server comprises a text recognition engine and the visual imagery comprises text;
extracting the text from the visual imagery using the text recognition engine;
generating, by the system server, a plurality of contexts based on the text, wherein the plurality of contexts comprises a word from the text;
ranking the plurality of contexts based on relevance to the extracted text;
querying, by the system server, a database using the ranked plurality of contexts as a parameter;
generating, by the system server, a list of search results based on the query using the ranked plurality of contexts, wherein one or more items from the generated list of search results are to be displayed on the client device as an augmented representation of the visual imagery on the client device, the augmented representation to include the visual imagery and a representation of the one or more items from the list integrated with the visual imagery;
displaying the augmented representation of the visual imagery in a first region of the screen; and
displaying the list of search results in a second region of the screen, wherein the first region of the screen has a first area size, the second region of the screen has a second area size, and the first area size is larger than the second area size.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A non-transitory computer readable storage medium having instructions stored thereon that, when executed cause a machine to perform a method comprising:
receiving visual imagery from a client device, wherein the visual imagery comprises text;
extracting the text from the visual imagery using the text recognition engine;
generating a plurality of contexts based on the text, wherein the plurality of contexts comprises a word from the text;
ranking the plurality of contexts based on relevance to the extracted text;
querying a database using the ranked plurality of contexts;
generating a list of search results based on the query using the ranked plurality of contexts, wherein one or more items from the generated list are to be displayed on the client device as an augmented representation of the visual imagery on the client device, the augmented representation to include the visual imagery and a representation of the one or more items from the list integrated with the visual imagery;
displaying the augmented representation of the visual imagery in a first region of the screen; and
displaying the list of search results in a second region of the screen, wherein the first region of the screen has a first area size, the second region of the screen has a second area size, and the first area size is larger than the second area size.
Dependent claims: 15, 16, 17
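The display limitation shared by the independent claims (a first region for the augmented imagery whose area is larger than the second region holding the result list) can be checked with a short sketch. The claims do not prescribe any geometry, so the vertical stacking, the `first_fraction` parameter, and the `Region` type below are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A rectangular screen region in pixels (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int

    @property
    def area(self) -> int:
        return self.width * self.height


def split_screen(width, height, first_fraction=0.7):
    """Assumption: the regions are stacked vertically, augmented imagery on top.

    Returns (first_region, second_region) and rejects any split in which the
    first region would not be strictly larger than the second.
    """
    first_height = round(height * first_fraction)
    first = Region(0, 0, width, first_height)
    second = Region(0, first_height, width, height - first_height)
    if second.area <= 0 or first.area <= second.area:
        raise ValueError("first region must be larger than a non-empty second region")
    return first, second


if __name__ == "__main__":
    imagery_region, list_region = split_screen(1080, 1920)
    print(imagery_region.area, list_region.area)
```

A side-by-side split or an overlay panel would satisfy the same area relationship; only the comparison of the two areas reflects the claim text.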
18. A system comprising:
a communication interface to receive visual imagery from a client device, wherein the visual imagery comprises text;
an electronic memory to store the visual imagery;
a text recognition engine to extract the text from the visual imagery; and
a context engine to generate a plurality of contexts and rank the plurality of contexts based on relevance to the extracted text, wherein the plurality of contexts comprises a word from the text;
wherein the communication interface is to transmit a first query to a database using the ranked plurality of contexts to generate a list of search results to be displayed on the client device as an augmented representation of the visual imagery on the client device, the augmented representation to include the visual imagery and a representation of the one or more items from the list integrated with the visual imagery; and
a screen having a first region to display the augmented representation of the visual imagery and a second region to display the list of search results, wherein the first region of the screen has a first area size, the second region of the screen has a second area size, and the first area size is larger than the second area size.
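One way to picture how the components recited in claim 18 fit together is as a single object that holds each engine as a pluggable callable. This is only a sketch under stated assumptions: the class name `InformationServer`, its field names, and the shape of the returned dictionary are hypothetical, and the engines themselves are left to the caller because the claim does not describe their internals.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class InformationServer:
    """Hypothetical wiring of the claimed components; not the patented implementation."""
    recognize_text: Callable[[bytes], str]        # stands in for the text recognition engine
    rank_contexts: Callable[[str], List[str]]     # stands in for the context engine
    search: Callable[[List[str]], List[str]]      # stands in for the database query
    stored_imagery: List[bytes] = field(default_factory=list)  # stands in for the electronic memory

    def handle_upload(self, imagery: bytes) -> dict:
        """Process imagery received from a client and build the data for both screen regions."""
        self.stored_imagery.append(imagery)
        text = self.recognize_text(imagery)
        contexts = self.rank_contexts(text)
        results = self.search(contexts)
        return {
            # First region: the imagery plus the items to integrate with it
            # (assumption: only the top result is overlaid).
            "augmented": {"imagery": imagery, "overlay": results[:1]},
            # Second region: the full list of search results.
            "result_list": results,
        }
```

Passing the engines in as callables keeps the sketch independent of any particular recognition or search library.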
Specification