Providing Information Services Related to Multimodal Inputs

US 20110093264A1
Filed: 12/21/2010
Published: 04/21/2011
Est. Priority Date: 08/31/2004
Status: Active Grant

Abstract

A system and method provide information services related to multimodal inputs. Several different types of data used as multimodal inputs are described. Also described are various methods involving the generation of contexts using multimodal inputs, synthesizing context-information service mappings, and identifying and providing information services.

8 Claims

1. A method comprising:
- receiving at a system server a first input comprising visual information from a client device, wherein the visual information comprises a plurality of character sequence groups;
  
  receiving at the system server a second input comprising audio information from the client device, wherein the audio information comprises vocals;
  
  extracting the plurality of character sequence groups from the visual information using an optical character recognition engine;
  
  converting the vocals to text using a speech recognition engine; and
  
  generating a plurality of contexts wherein a first context comprises a first character sequence group from the plurality of character sequence groups and a first portion of the text.
  - 2. The method of claim 1 comprising:
    - identifying a first context from the plurality of contexts based on the first input;
      
      identifying a second context from the plurality of contexts based on the second input;
      
      querying a database using the first and second contexts to generate a first list comprising at least one information service; and
      
      identifying a third context based on the first list of at least one information service.
  - 3. The method of claim 2 comprising:
    - querying the database using the third context to generate a second list of information services;
      
      displaying the second list of information services on a screen of the client device;
      
      mapping the third context to a first information service in the list of information services, wherein the mapping the third context further comprises using rules stored at the system server; and
      
      identifying the first information service as a first type or a second type.
  - 4. The method of claim 1 comprising:
    - if a user accesses the first information service identified as a first type, debiting an account of the user; and
      
      if the user accesses the first information service identified as a second type, debiting an account of a provider of the information service.
  - 5. The method of claim 2 wherein the identifying a first context further comprises:
    - extracting at least one character sequence group from the visual information;
      
      comparing at least one character sequence group to a plurality of context constituents, wherein the plurality of context constituents are associated with the first context; and
      
      if the at least one character sequence group matches a context constituent in the plurality of context constituents, recording usage history of the first context.
  - 6. The method of claim 2 comprising:
    - storing the first and second input in the database;
      
      retrieving the first and second input in response to a query from the user;
      
      displaying the visual information on the screen of the client device; and
      
      playing the audio information through a speaker of the client device.
  - 7. The method of claim 1 comprising:
    - storing at the system server context information comprising a plurality of context constituents, wherein the identifying a first context based on the first input comprises associating a first context constituent to the first context; and
      
      the identifying a second context based on the second input comprises associating a second context constituent to the second context.
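
The constituent matching of claim 5 and the type-dependent debiting of claim 4 can be sketched roughly as follows. This is only an illustration under stated assumptions: `StoredContext`, `record_matches`, `debit_for_access`, the dictionary-based accounts, the flat `fee`, and the case-insensitive exact comparison are all hypothetical choices that the claims do not specify.

```python
from dataclasses import dataclass, field


@dataclass
class StoredContext:
    """Hypothetical server-side record for one context."""
    name: str
    constituents: set  # lower-cased context constituents
    usage_history: list = field(default_factory=list)


def record_matches(context, character_sequence_groups):
    """Compare OCR character sequence groups with a context's constituents.

    Assumption: a match is a case-insensitive exact comparison. Each match
    is appended to the context's usage history.
    """
    matched = []
    for group in character_sequence_groups:
        if group.lower() in context.constituents:
            context.usage_history.append(group.lower())
            matched.append(group)
    return matched


def debit_for_access(service_type, user_account, provider_account, fee=1):
    """Debit the user for a first-type service, the provider for a second-type one."""
    if service_type == "first":
        user_account["balance"] -= fee
    elif service_type == "second":
        provider_account["balance"] -= fee
    else:
        raise ValueError(f"unknown service type: {service_type!r}")


# Stand-in data; the groups would come from an OCR engine in practice.
dining = StoredContext("dining", {"cafe", "menu"})
matched = record_matches(dining, ["Cafe", "Roma"])

user, provider = {"balance": 10}, {"balance": 10}
debit_for_access("second", user, provider)
```

In this run only "Cafe" matches a stored constituent, and because the accessed service is of the second type, only the provider's balance is reduced.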

8. A method comprising:
- receiving at a system server a first input comprising visual information from a client device, wherein the visual information comprises a plurality of character sequence groups;
  
  receiving at the system server a second input comprising audio information from the client device, wherein the audio information comprises vocals;
  
  extracting the plurality of character sequence groups from the visual information using an optical character recognition engine;
  
  converting the vocals to text using a speech recognition engine;
  
  generating a plurality of contexts wherein a first context comprises a first character sequence group from the plurality of character sequence groups and a first portion of the text;
  
  identifying a first context from the plurality of contexts based on the first input;
  
  identifying a second context from the plurality of contexts based on the second input; and
  
  querying a database using the first and second contexts to generate a first list comprising at least one information service.
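
The steps of claim 8 can be illustrated end to end with a minimal sketch. Everything named here is an assumption made for illustration: the OCR and speech recognition engines are replaced by literal strings, contexts are formed by pairing every character sequence group with every transcript word, a context is "identified" when its keyword appears in the database, the query returns services mapped to either context, and the `context_service` table schema is hypothetical.

```python
import sqlite3
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Context:
    character_sequence_group: str  # from OCR of the visual input
    text_portion: str              # from speech-to-text of the audio input


def generate_contexts(groups, transcript):
    # Assumption: pair every OCR group with every word of the transcript.
    return [Context(g, w) for g, w in product(groups, transcript.split())]


def build_service_db(mappings):
    # Hypothetical schema: one row per (keyword, information service) mapping.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE context_service (keyword TEXT, service TEXT)")
    conn.executemany("INSERT INTO context_service VALUES (?, ?)", mappings)
    return conn


def is_known(conn, keyword):
    row = conn.execute(
        "SELECT 1 FROM context_service WHERE keyword = ? LIMIT 1", (keyword,)
    ).fetchone()
    return row is not None


def identify_contexts(conn, contexts):
    # First context: chosen by its visual part; second: by its audio part.
    first = next(
        (c for c in contexts if is_known(conn, c.character_sequence_group.lower())),
        None,
    )
    second = next(
        (c for c in contexts if is_known(conn, c.text_portion.lower())), None
    )
    return first, second


def query_services(conn, first, second):
    # Collect the keywords of whichever contexts were identified.
    keys = []
    if first is not None:
        keys.append(first.character_sequence_group.lower())
    if second is not None:
        keys.append(second.text_portion.lower())
    if not keys:
        return []
    placeholders = ", ".join("?" for _ in keys)
    rows = conn.execute(
        "SELECT DISTINCT service FROM context_service "
        f"WHERE keyword IN ({placeholders}) ORDER BY service",
        keys,
    ).fetchall()
    return [service for (service,) in rows]


conn = build_service_db([
    ("cafe", "restaurant reviews"),
    ("menu", "online ordering"),
    ("menu", "restaurant reviews"),
    ("train", "timetable lookup"),
])
contexts = generate_contexts(["Cafe", "Roma"], "show menu")
first, second = identify_contexts(conn, contexts)
first_list = query_services(conn, first, second)
```

Matching on either context rather than both is a design choice of this sketch; a stricter variant could intersect the two result sets instead.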

Current Assignee: Tahoe Research Limited (f/k/a Learndale Limited) (Vector Capital Corporation)
Original Assignee: Intel Corporation
Inventors: Gopalakrishnan, Kumar

Granted Patent: US 8,370,323 B2
US Class Current: 704/235
CPC Class Codes

G06F 16/51   Indexing; Data structures t...

G06F 16/58   Retrieval characterised by ...

G06F 16/583   using metadata automaticall...

G06F 16/9032   Query formulation

G06F 18/256   of results relating to diff...

G06F 2203/0381   Multimodal input, i.e. inte...

G06F 3/038   Control and interface arran...

G06V 10/811   the classifiers operating o...

G06V 20/63   Scene text, e.g. street names

G06V 30/10   Character recognition

G06V 30/274   Syntactic or semantic conte...
