Providing information services related to multimodal inputs
Abstract
A system and method provide information services related to multimodal inputs. Several different types of data used as multimodal inputs are described. Also described are various methods involving the generation of contexts using multimodal inputs, synthesizing context-information service mappings, and identifying and providing information services.
20 Claims
1. A method comprising:
receiving at a system server a first input comprising visual information from a client device, wherein the visual information comprises a plurality of character sequence groups;
receiving at the system server a second input comprising audio information from the client device, wherein the audio information comprises vocals;
extracting the plurality of character sequence groups from the visual information using an optical character recognition engine;
converting the vocals to text using a speech recognition engine;
generating a plurality of contexts wherein a context of the plurality of contexts comprises a first character sequence group from the plurality of character sequence groups and a first portion of the text;
identifying a first context from the plurality of contexts based on the first input;
identifying a second context from the plurality of contexts based on the second input;
querying a database using the first and second contexts to generate a first list comprising at least one information service;
identifying a third context based on the first list of at least one information service;
querying the database using the third context to generate a second list comprising at least one information service;
wherein the second list of at least one information service is to be displayed on a screen of the client device;
mapping the third context to an information service in the second list of information services, wherein the mapping the third context further comprises using rules stored at the system server; and
identifying the information service as a first type or a second type.
Dependent claims: 2, 3, 4, 5, 6
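The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Context` class, the `generate_contexts`, `query`, and `provide_services` functions, the dictionary-based database and rules, and the way each context is selected are all assumptions made for illustration. The OCR and speech recognition engines are passed in as plain callables because the claim does not name specific engines.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Context:
    char_group: str    # one character sequence group from OCR
    text_portion: str  # one portion of the recognized speech text


def generate_contexts(groups: List[str], text: str) -> List[Context]:
    # Assumption: pair every character group with every word of the text.
    return [Context(g, w) for g in groups for w in text.split()]


def query(db: Dict[str, List[str]], keys: List[str]) -> List[str]:
    # Toy database lookup: keyword -> service names, first-seen order, no repeats.
    services: List[str] = []
    for key in keys:
        for name in db.get(key.lower(), []):
            if name not in services:
                services.append(name)
    return services


def provide_services(
    image: bytes,
    audio: bytes,
    ocr: Callable[[bytes], List[str]],
    asr: Callable[[bytes], str],
    db: Dict[str, List[str]],
    rules: Dict[str, str],
    service_types: Dict[str, str],
) -> Tuple[List[str], str, str]:
    contexts = generate_contexts(ocr(image), asr(audio))

    # Assumption: the first context is chosen by its visual half,
    # the second context by its audio half.
    first = next((c for c in contexts if c.char_group.lower() in db), None)
    second = next((c for c in contexts if c.text_portion.lower() in db), None)
    if first is None or second is None:
        raise LookupError("no context matched the database")

    first_list = query(db, [first.char_group, second.text_portion])
    if not first_list:
        raise LookupError("first query returned no services")

    # Assumption: the third context reuses the visual half and takes the
    # top service of the first list as its text half.
    third = Context(first.char_group, first_list[0])
    second_list = query(db, [third.text_portion])
    if not second_list:
        raise LookupError("third context returned no services")

    # Server-side rules map the third context to one service in the second list.
    mapped = rules.get(third.text_portion, second_list[0])
    if mapped not in second_list:
        mapped = second_list[0]

    # Assumption: services missing from service_types default to the second type.
    return second_list, mapped, service_types.get(mapped, "second")
```

The function returns the second list (the one to be displayed on the client device), the service the third context was mapped to, and that service's type. Passing the engines as callables keeps the sketch independent of any particular OCR or speech recognition library.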

7. A computer readable storage medium having instructions stored thereon that, when executed cause a machine to perform a method comprising:
receiving at a system server a first input comprising visual information from a client device, wherein the visual information comprises a plurality of character sequence groups;
receiving at the system server a second input comprising audio information from the client device, wherein the audio information comprises vocals;
extracting the plurality of character sequence groups from the visual information using an optical character recognition engine;
converting the vocals to text using a speech recognition engine;
generating a plurality of contexts wherein a first context of the plurality of contexts comprises a first character sequence group from the plurality of character sequence groups and a first portion of the text;
identifying a first context from the plurality of contexts based on the first input;
identifying a second context from the plurality of contexts based on the second input;
querying a database using the first and second contexts to generate a first list comprising at least one information service;
identifying a third context based on the first list of at least one information service;
querying the database using the third context to generate a second list of at least one information service;
wherein the second list of at least one information service is to be displayed on a screen of the client device, wherein the third context is to be mapped to an information service in the second list of at least one information service, wherein the mapping the third context further comprises using rules stored at the system server; and
wherein the information service is to be identified as a first type or a second type.
Dependent claims: 8, 9, 10, 11, 12, 13, 14

15. A system comprising:
a processor;
memory coupled with the processor;
a communication interface to receive a first input comprising visual information and a second input comprising audio information from a client device, wherein the visual information comprises a plurality of character sequence groups, and wherein the audio information comprises vocals;
an optical character recognition engine to extract the plurality of character sequence groups from the visual information;
a speech recognition engine to convert the vocals to text;
a context engine to generate a plurality of contexts, wherein a context of the plurality of contexts comprises a first character sequence group from the plurality of character sequence groups and a first portion of the text, wherein a first context is identified from the plurality of contexts based on the first input and a second context is identified from the plurality of contexts based on the second input;
wherein the communication interface is to transmit a first query to a database using the first and second contexts to generate a first list comprising at least one information service;
wherein the context engine is to identify a third context based on the first list of at least one information service;
wherein the communication interface is to transmit a second query to the database using the third context to generate a second list of at least one information service, wherein the second list of at least one information service is to be displayed on a screen of the client device; and
wherein the context engine is to map the third context to a first information service in the second list of at least one information service, wherein mapping the third context comprises using rules stored on the system, and wherein the information service is identified as a first type or a second type.
Dependent claims: 16, 17, 18, 19

20. A mobile device comprising:
a processor;
a communication interface to send a first input comprising visual information and a second input comprising audio information to a system server, wherein the visual information comprises a plurality of character sequence groups, wherein the audio information comprises vocals, wherein the plurality of character sequence groups is to be extracted from the visual information using an optical character recognition engine, and wherein the vocals are to be converted to text using a speech recognition engine;
wherein the system server is to generate a plurality of contexts including a context comprising a first character sequence group from the plurality of character sequence groups and a first portion of the text, identify a first context from the plurality of contexts based on the first input, identify a second context from the plurality of contexts based on the second input, query a database using the first and second contexts to generate a first list comprising at least one information service, identify a third context based on the first list of at least one information service, query the database using the third context to generate a second list of at least one information service, wherein the third context is to be mapped to an information service in the second list of at least one information service, wherein the mapping the third context further comprises using rules stored at the system server, and wherein the information service is to be identified as a first type or a second type; and
a display to display the second list of information services.
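The client side of claim 20 (sending both inputs to the system server, then displaying the returned list) might look like the following Python sketch. The JSON wire format, the `visual`, `audio`, and `services` field names, and both function names are hypothetical; the claim does not specify how the inputs are encoded or transmitted.

```python
import base64
import json


def build_request(image: bytes, audio: bytes) -> str:
    """Bundle the visual and audio inputs into one JSON body (hypothetical format)."""
    return json.dumps({
        "visual": base64.b64encode(image).decode("ascii"),
        "audio": base64.b64encode(audio).decode("ascii"),
    })


def render_service_list(response_body: str) -> str:
    """Format the server's second list for the display, one numbered entry per line."""
    services = json.loads(response_body)["services"]
    return "\n".join(f"{i}. {name}" for i, name in enumerate(services, start=1))
```

Base64 is used here only because raw bytes cannot be placed directly in JSON; a multipart upload would serve equally well.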
Specification