Providing information services related to multimodal inputs

US 8,370,323 B2
Filed: 12/21/2010
Issued: 02/05/2013
Est. Priority Date: 08/31/2004
Status: Expired due to Fees

Abstract

A system and method provide information services related to multimodal inputs. Several different types of data used as multimodal inputs are described. Also described are various methods involving the generation of contexts using multimodal inputs, synthesizing context-information service mappings, and identifying and providing information services.

72 Citations

20 Claims

1. A method comprising:
- receiving at a system server a first input comprising visual information from a client device, wherein the visual information comprises a plurality of character sequence groups;
  
  receiving at the system server a second input comprising audio information from the client device, wherein the audio information comprises vocals;
  
  extracting the plurality of character sequence groups from the visual information using an optical character recognition engine;
  
  converting the vocals to text using a speech recognition engine;
  
  generating a plurality of contexts wherein a context of the plurality of contexts comprises a first character sequence group from the plurality of character sequence groups and a first portion of the text;
  
  identifying a first context from the plurality of contexts based on the first input;
  
  identifying a second context from the plurality of contexts based on the second input;
  
  querying a database using the first and second contexts to generate a first list comprising at least one information service;
  
  identifying a third context based on the first list of at least one information service;
  
  querying the database using the third context to generate a second list comprising at least one information service;
  
  wherein the second list of at least one information service is to be displayed on a screen of the client device;
  
  mapping the third context to an information service in the second list of information services, wherein the mapping the third context further comprises using rules stored at the system server; and
  
  identifying the information service as a first type or a second type.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The method of claim 1 comprising:
    - in response to a user accessing the information service identified as the first type, debiting an account of the user; and
      
      in response to the user accessing the information service identified as the second type, debiting an account of a provider of the information service.
  - 3. The method of claim 1, wherein the identifying the first context further comprises:
    - extracting at least one character sequence group from the visual information;
      
      comparing at least one character sequence group to a plurality of context constituents, wherein the plurality of context constituents are associated with the first context; and
      
      if the at least one character sequence group matches a context constituent in the plurality of context constituents, recording usage history of the first context.
  - 4. The method of claim 1, further comprising:
    - storing the first and second input in the database; and
      
      retrieving the first and second input in response to a query from a user;
      
      wherein the visual information is to be displayed on the screen of the client device, and wherein the audio information is to be played through a speaker of the client device.
  - 5. The method of claim 1, further comprising:
    - storing at the system server context information comprising a plurality of context constituents, wherein the identifying the first context based on the first input comprises associating a first context constituent of the plurality of context constituents to the first context; and
      
      the identifying the second context based on the second input comprises associating a second context constituent of the plurality of context constituents to the second context.
  - 6. The method of claim 1, wherein identifying the third context is based further on user history of usage of one or more information services of the first list of at least one information service.
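
As an illustration only, the steps of claim 1 can be sketched in Python. Everything in the sketch is an assumption made for the example and is not taken from the patent: the names `Context`, `generate_contexts`, `SERVICE_DB`, `query`, and `two_stage_lookup`, the in-memory dictionary standing in for the database, the word-level split of the transcribed speech, and the placeholder rules used to pick the first, second, and third contexts. The optical character recognition and speech recognition engines are left out; their outputs are passed in as plain strings.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Context:
    """A candidate context: one OCR character group plus one portion of the speech text."""
    char_group: str
    text_portion: str


def generate_contexts(char_groups, speech_text):
    """Pair every character sequence group with every word of the transcribed speech."""
    portions = speech_text.split()
    return [Context(g, p) for g, p in product(char_groups, portions)]


# Hypothetical stand-in for the database: keyword -> information services.
SERVICE_DB = {
    "pizza": ["restaurant reviews", "coupons"],
    "nearby": ["map search"],
    "coupons": ["pizza coupon listing"],
}


def query(keywords):
    """Return the services matching the keywords, in order and without duplicates."""
    results = []
    for keyword in keywords:
        for service in SERVICE_DB.get(keyword.lower(), []):
            if service not in results:
                results.append(service)
    return results


def two_stage_lookup(char_groups, speech_text):
    """Run both queries and return (first_list, third_context, second_list)."""
    contexts = generate_contexts(char_groups, speech_text)
    if not contexts:
        return [], None, []
    # Placeholder rule: the first context comes from the visual input,
    # the second from the audio input.
    first_context = contexts[0].char_group
    second_context = contexts[0].text_portion
    first_list = query([first_context, second_context])
    # Placeholder rule: the third context is the first service in the
    # first list that is itself a key in the database.
    third_context = next((s for s in first_list if s in SERVICE_DB), None)
    second_list = query([third_context]) if third_context else []
    return first_list, third_context, second_list


first_list, third_context, second_list = two_stage_lookup(["PIZZA", "OPEN"], "nearby deals")
print(first_list, third_context, second_list)
```

The point of the sketch is the shape of the flow: the result of the first query feeds the choice of a third context, which drives a second query whose result is what the client device would display.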

7. A computer readable storage medium having instructions stored thereon that, when executed cause a machine to perform a method comprising:
- receiving at a system server a first input comprising visual information from a client device, wherein the visual information comprises a plurality of character sequence groups;
  
  receiving at the system server a second input comprising audio information from the client device, wherein the audio information comprises vocals;
  
  extracting the plurality of character sequence groups from the visual information using an optical character recognition engine;
  
  converting the vocals to text using a speech recognition engine;
  
  generating a plurality of contexts wherein a first context of the plurality of contexts comprises a first character sequence group from the plurality of character sequence groups and a first portion of the text;
  
  identifying a first context from the plurality of contexts based on the first input;
  
  identifying a second context from the plurality of contexts based on the second input;
  
  querying a database using the first and second contexts to generate a first list comprising at least one information service;
  
  identifying a third context based on the first list of at least one information service;
  
  querying the database using the third context to generate a second list of at least one information service;
  
  wherein the second list of at least one information service is to be displayed on a screen of the client device, wherein the third context is to be mapped to an information service in the second list of at least one information service, wherein the mapping the third context further comprises using rules stored at the system server; and
  
  wherein the information service is to be identified as a first type or a second type.
- Dependent claims (8, 9, 10, 11, 12, 13, 14):
  - 8. The computer readable storage medium of claim 7, wherein the instructions cause the machine to perform the method further comprising:
    - in response to a user accessing the information service identified as the first type, debiting an account of the user; and
      
      in response to the user accessing the information service identified as the second type, debiting an account of a provider of the information service.
  - 9. The computer readable storage medium of claim 7, wherein the instructions cause the machine to perform the method further comprising:
    - extracting at least one character sequence group from the visual information;
      
      comparing at least one character sequence group to a plurality of context constituents, wherein the plurality of context constituents are associated with the first context; and
      
      if the at least one character sequence group matches a context constituent in the plurality of context constituents, recording usage history of the first context.
  - 10. The computer readable storage medium of claim 7, wherein the instructions cause the machine to perform the method further comprising:
    - storing the first and second input in the database; and
      
      retrieving the first and second input in response to a query from a user;
      
      wherein the visual information is to be displayed on the screen of the client device, and wherein the audio information is to be played through a speaker of the client device.
  - 11. The computer readable storage medium of claim 7, wherein the instructions cause the machine to perform the method further comprising:
    - storing at the system server context information comprising a plurality of context constituents, wherein the identifying the first context based on the first input comprises associating a first context constituent of the plurality of context constituents to the first context; and
      
      the identifying the second context based on the second input comprises associating a second context constituent of the plurality of context constituents to the second context.
  - 12. The computer readable storage medium of claim 7, wherein identifying the third context is based further on user history of usage of one or more information services of the first list of at least one information service.
  - 13. The computer readable storage medium of claim 7, wherein identifying the third context is based further on a capability of the client device.
  - 14. The computer readable storage medium of claim 7, wherein identifying the third context is based further on a user's membership data for a membership group that limits access to an information service of the first list of at least one information service.
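
Claims 12 to 14 name three further signals for identifying the third context: usage history, client device capability, and group membership. The following Python sketch shows one way those signals could be combined. It is a hypothetical illustration, not the method described in the patent: the `Service` record, its fields, the function `pick_third_context`, and the "filter by capability and membership, then rank by usage count" rule are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Service:
    """Hypothetical record for an information service in the first list."""
    name: str
    needs_capability: Optional[str] = None    # e.g. "video"; None means no requirement
    members_only_group: Optional[str] = None  # None means open to every user


def pick_third_context(first_list, usage_counts, device_capabilities, memberships):
    """Return the name of the service to use as the third context, or None."""
    # Drop services the device cannot present or the user may not access.
    eligible = [
        s for s in first_list
        if (s.needs_capability is None or s.needs_capability in device_capabilities)
        and (s.members_only_group is None or s.members_only_group in memberships)
    ]
    if not eligible:
        return None
    # Prefer the service the user has used most often; max() returns the
    # first maximal element, so ties keep the order of the first list.
    best = max(eligible, key=lambda s: usage_counts.get(s.name, 0))
    return best.name


services = [
    Service("video tour", needs_capability="video"),
    Service("text summary"),
    Service("staff directory", members_only_group="employees"),
]
usage = {"video tour": 9, "text summary": 2, "staff directory": 5}

# An audio-only device belonging to a non-member leaves a single eligible service.
print(pick_third_context(services, usage, {"audio"}, set()))
```

Filtering before ranking means a heavily used service is never chosen when the device or the user's membership rules it out.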

15. A system comprising:
- a processor;
  
  memory coupled with the processor;
  
  a communication interface to receive a first input comprising visual information and a second input comprising audio information from a client device, wherein the visual information comprises a plurality of character sequence groups, and wherein the audio information comprises vocals;
  
  an optical character recognition engine to extract the plurality of character sequence groups from the visual information;
  
  a speech recognition engine to convert the vocals to text;
  
  a context engine to generate a plurality of contexts, wherein a context of the plurality of contexts comprises a first character sequence group from the plurality of character sequence groups and a first portion of the text, wherein a first context is identified from the plurality of contexts based on the first input and a second context is identified from the plurality of contexts based on the second input;
  
  wherein the communication interface is to transmit a first query to a database using the first and second contexts to generate a first list comprising at least one information service;
  
  wherein the context engine is to identify a third context based on the first list of at least one information service;
  
  wherein the communication interface is to transmit a second query to the database using the third context to generate a second list of at least one information service, wherein the second list of at least one information service is to be displayed on a screen of the client device; and
  
  wherein the context engine is to map the third context to a first information service in the second list of at least one information service, wherein mapping the third context comprises using rules stored on the system, and wherein the information service is identified as a first type or a second type.
- Dependent claims (16, 17, 18, 19):
  - 16. The system of claim 15, wherein:
    - in response to a user accessing the information service identified as the first type, an account of the user is debited; and
      
      in response to the user accessing the information service identified as the second type, an account of a provider of the first information service is debited.
  - 17. The system of claim 15, wherein the context engine is to further:
    - extract at least one character sequence group from the visual information;
      
      compare at least one character sequence group to a plurality of context constituents, wherein the plurality of context constituents are associated with the first context; and
      
      if the at least one character sequence group matches a context constituent in the plurality of context constituents, record usage history of the first context.
  - 18. The system of claim 15, wherein the communication interface is to further:
    - transmit the first and second input for storage in the database; and
      
      retrieve the first and second input in response to a query from a user;
      
      wherein the visual information is to be displayed on the screen of the client device, and wherein the audio information is to be played through a speaker of the client device.
  - 19. The system of claim 15, further comprising:
    - a storage medium to store context information comprising a plurality of context constituents, wherein identifying the first context based on the first input comprises associating a first context constituent of the plurality of context constituents to the first context, and wherein identifying the second context based on the second input comprises associating a second context constituent of the plurality of context constituents to the second context.
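
The billing rule in claims 2, 8, and 16 (debit the user for a first-type service, debit the provider for a second-type service) can be sketched as a small Python function. The names `ServiceType` and `debit_on_access`, the per-access fee, and the dictionary of balances in integer cents are assumptions made for this example; the claims do not specify amounts or account structures.

```python
from enum import Enum


class ServiceType(Enum):
    FIRST = "first"    # accessing the service debits the user
    SECOND = "second"  # accessing the service debits the provider


def debit_on_access(service_type, fee_cents, balances, user_id, provider_id):
    """Debit the account selected by the service type and return the payer's id."""
    payer = user_id if service_type is ServiceType.FIRST else provider_id
    balances[payer] -= fee_cents  # hypothetical flat fee per access
    return payer


balances = {"alice": 500, "acme": 1000}
print(debit_on_access(ServiceType.FIRST, 25, balances, "alice", "acme"))
print(debit_on_access(ServiceType.SECOND, 25, balances, "alice", "acme"))
print(balances)
```

Keeping the type as an enumeration makes the two cases explicit, so the choice of payer is a single branch on the identified type.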

20. A mobile device comprising:
- a processor;
  
  a communication interface to send a first input comprising visual information and a second input comprising audio information to a system server, wherein the visual information comprises a plurality of character sequence groups, wherein the audio information comprises vocals, wherein the plurality of character sequence groups is to be extracted from the visual information using an optical character recognition engine, and wherein the vocals are to be converted to text using a speech recognition engine;
  
  wherein the system server is to generate a plurality of contexts including a context comprising a first character sequence group from the plurality of character sequence groups and a first portion of the text, identify a first context from the plurality of contexts based on the first input, identify a second context from the plurality of contexts based on the second input, query a database using the first and second contexts to generate a first list comprising at least one information service, identify a third context based on the first list of at least one information service, query the database using the third context to generate a second list of at least one information service, wherein the third context is to be mapped to an information service in the second list of at least one information service, wherein the mapping the third context further comprises using rules stored at the system server, and wherein the information service is to be identified as a first type or a second type; and
  
  a display to display the second list of information services.

Current Assignee: Tahoe Research Limited (f/k/a Learndale Limited) (Vector Capital Corporation)
Original Assignee: Intel Corporation
Inventors: Gopalakrishnan, Kumar
Primary Examiner(s): Pham, Khanh
Assistant Examiner(s): CHEEMA, AZAM M
Application Number: US12/975,000
Publication Number: US 20110093264A1
Time in Patent Office: 777 Days
Field of Search: None
US Class Current: 707/710

CPC Class Codes:
G06F 16/51   Indexing; Data structures t...
G06F 16/58   Retrieval characterised by ...
G06F 16/583   using metadata automaticall...
G06F 16/9032   Query formulation
G06F 18/256   of results relating to diff...
G06F 2203/0381   Multimodal input, i.e. inte...
G06F 3/038   Control and interface arran...
G06V 10/811   the classifiers operating o...
G06V 20/63   Scene text, e.g. street names
G06V 30/10   Character recognition
G06V 30/274   Syntactic or semantic conte...
