Providing information services related to multimodal inputs

US 9,639,633 B2
Filed: 10/09/2012
Issued: 05/02/2017
Est. Priority Date: 08/31/2004
Status: Active Grant

2 Assignments

0 Petitions

Abstract

A system and method provides information services related to multimodal inputs. Several different types of data used as multimodal inputs are described. Also described are various methods involving the generation of contexts using multimodal inputs, synthesizing context-information service mappings and identifying and providing information services.

81 Citations

18 Claims

1. A method comprising:
- receiving visual imagery at a system server from a client device, wherein the system server comprises a text recognition engine and the visual imagery comprises text;
  
  extracting the text from the visual imagery using the text recognition engine;
  
  generating, by the system server, a plurality of contexts based on the text, wherein the plurality of contexts comprises a word from the text;
  
  ranking the plurality of contexts based on relevance to the extracted text;
  
  querying, by the system server, a database using the ranked plurality of contexts as a parameter;
  
  generating, by the system server, a list of search results based on the query using the ranked plurality of contexts, wherein one or more items from the generated list of search results are to be displayed on the client device as an augmented representation of the visual imagery on the client device, the augmented representation to include the visual imagery and a representation of the one or more items from the list integrated with the visual imagery;
  
  displaying the augmented representation of the visual imagery in a first region of the screen; and
  
  displaying the list of search results in a second region of the screen, wherein the first region of the screen has a first area size, the second region of the screen has a second area size, and the first area size is larger than the second area size.
- - 2. The method of claim 1, further comprising:
    - receiving, at the system server, audio information from the client device, the audio information comprising vocals; and
      
      converting the vocals to second text using a speech recognition engine;
      
      wherein one of the plurality of generated contexts comprises the second text from the received audio information.
  - 3. The method of claim 1, further comprising:
    - sending, by the system server, the plurality of ranked generated contexts to the client device for display on the client device; and
      
      receiving, by the system server, a selection of the plurality of ranked generated contexts from the client device;
      
      wherein querying the information service comprises querying a database using the selection of the plurality of ranked generated contexts to generate the list of textual search results comprising a list of information services based on the query.
  - 4. The method of claim 1, further comprising:
    - receiving, by the system server, one of a first or second user input from the client device, the first user input corresponding to a first category of information services, and the second user input corresponding to a second category of information services;
      
      in response to receiving the first user input, querying using the plurality of generated contexts, by the system server, a first database having information services of the first category to generate a first list of information services;
      
      in response to receiving the second user input, querying, using the plurality of generated contexts, by the system server, a second database having information services of the second category to generate a second list of information services; and
      
      sending, by the system server, the generated first or second list of information services to the client device to be overlaid on the visual imagery when the visual imagery is viewed by a user of the client device.
  - 5. The method of claim 1, wherein the augmented visual imagery comprises a first embedded link linking to a first information service and a second embedded link linking to a second information service.
  - 6. The method of claim 1, wherein the augmented representation of the visual imagery further comprises a graphical icon in the second region of the screen representing an information service.
  - 7. The method of claim 1, wherein the augmented representation of the visual imagery comprises an image of an item and the list of search results comprises a price for the item, a review of the item, and an identifier of a store selling the item.
  - 8. The method of claim 1, wherein the visual imagery received at the system server comprises an image of an item and the augmented representation of the visual imagery comprises a map comprising one or more icons representing stores selling the item.
  - 9. The method of claim 1 further comprising:
    - receiving, by the system server, a location of the client device obtained using a global positioning system (GPS) receiver component of the client device;
      
      wherein querying the database further comprises using the location of the client device as a parameter.
  - 10. The method of claim 1, further comprising:
    - extracting a plurality of context constituents associated with the visual imagery, including one of a font or color of the text;
      
      wherein querying the database further comprises using one of the plurality of context constituents as a parameter.
  - 11. The method of claim 1, wherein the generated list of search results comprises one or more information services, and wherein ranking the plurality of contexts is based further on a user's usage history of the one or more information services.
  - 12. The method of claim 1, further comprising:
    - receiving, by the system server, a user-defined context constituent from the client device; and
      
      generating, by the system server, one of the plurality of contexts based further on the user-defined context constituent.
  - 13. The method of claim 12, further comprising:
    - receiving, by the system server, a selection of one or more of the generated plurality of contexts from the client device; and
      
      receiving, by the system server, media content from the client device, the media content associated with the selection of the one or more generated plurality of contexts;
      
      wherein the selection of the one or more generated plurality of contexts and the associated media content are to be available to other users querying the information service using the selection of the one or more generated plurality of contexts.
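
The server-side steps recited in claim 1 (generating contexts from the extracted text, ranking them, and querying a database with the ranked contexts) can be pictured with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the names `Context`, `generate_contexts`, `rank_contexts`, and `query_database` are hypothetical, word frequency stands in for "relevance to the extracted text", and a plain dictionary stands in for the database. The text recognition step is assumed to have already produced the input string.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    terms: tuple   # one or two words taken from the extracted text
    score: float   # hypothetical relevance score (frequency-based)


def generate_contexts(text):
    """Build candidate contexts: each distinct word and each adjacent word pair."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    words = [w for w in words if w]
    counts = Counter(words)
    total = len(words)
    scored = {}
    for word, count in counts.items():
        scored[(word,)] = count / total
    for a, b in zip(words, words[1:]):
        scored[(a, b)] = (counts[a] + counts[b]) / total
    return [Context(terms, score) for terms, score in scored.items()]


def rank_contexts(contexts):
    """Order contexts by descending score; ties are broken alphabetically."""
    return sorted(contexts, key=lambda c: (-c.score, c.terms))


def query_database(database, ranked_contexts):
    """Collect results in ranked-context order, skipping duplicates."""
    results = []
    for context in ranked_contexts:
        for item in database.get(context.terms, []):
            if item not in results:
                results.append(item)
    return results
```

With the hypothetical input `"Pizza Palace pizza menu"`, the sketch produces three one-word contexts and three two-word contexts, and a database keyed by term tuples returns its entries in the order of the ranked contexts that matched them.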

14. A non-transitory computer readable storage medium having instructions stored thereon that, when executed cause a machine to perform a method comprising:
- receiving visual imagery from a client device, wherein the visual imagery comprises text;
  
  extracting the text from the visual imagery using the text recognition engine;
  
  generating a plurality of contexts based on the text, wherein the plurality of contexts comprises a word from the text;
  
  ranking the plurality of contexts based on relevance to the extracted text;
  
  querying a database using the ranked plurality of contexts;
  
  generating a list of search results based on the query using the ranked plurality of contexts, wherein one or more items from the generated list are to be displayed on the client device as an augmented representation of the visual imagery on the client device, the augmented representation to include the visual imagery and a representation of the one or more items from the list integrated with the visual imagery;
  
  displaying the augmented representation of the visual imagery in a first region of the screen; and
  
  displaying the list of search results in a second region of the screen, wherein the first region of the screen has a first area size, the second region of the screen has a second area size, and the first area size is larger than the second area size.
- - 15. The non-transitory computer readable storage medium of claim 14, the method further comprising:
    - receiving audio information from the client device, the audio information comprising speech; and
      
      converting the speech to second text using a speech recognition engine;
      
      wherein one of the plurality of generated contexts comprises the second text from the received audio information.
  - 16. The non-transitory computer readable storage medium of claim 14, the method further comprising:
    - sending the plurality of ranked generated contexts to the client device for display on the client device; and
      
      receiving a selection of the plurality of ranked generated contexts from the client device;
      
      wherein querying the database comprises using the selection of the plurality of ranked generated contexts as a parameter to generate the list of search results comprising a list of information services based on the query.
  - 17. The non-transitory computer readable storage medium of claim 14, the method further comprising:
    - receiving one of a first or second user input from the client device, the first user input corresponding to a first category of information services, and the second user input corresponding to a second category of information services;
      
      in response to receiving the first user input, querying, using the plurality of generated contexts as a parameter, a first database having information services of the first category to generate a first list of information services;
      
      in response to receiving the second user input, querying, using the plurality of generated contexts, a second database having information services of the second category to generate a second list of information services; and
      
      sending the generated first or second list of information services to the client device to be overlaid on the visual imagery when the visual imagery is viewed by a user of the client device.
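
Claims 4 and 17 describe a user input that selects which category of information services is queried. A minimal sketch of that routing follows, under stated assumptions: the function name `route_query`, the category labels, and the dictionary-backed databases are all hypothetical and do not come from the patent.

```python
def route_query(user_input, contexts, databases):
    """Query only the database whose category matches the user input.

    `databases` maps a category label to a mapping from context to a list
    of information services (a hypothetical stand-in for real databases).
    """
    if user_input not in databases:
        raise KeyError(f"unknown category: {user_input!r}")
    services_by_context = databases[user_input]
    matches = []
    for context in contexts:
        for service in services_by_context.get(context, []):
            if service not in matches:
                matches.append(service)
    return matches
```

For example, with one database labelled `"shopping"` and another labelled `"reviews"`, passing `"shopping"` returns only services from the first database for the given contexts. The returned list is what would then be sent to the client device for overlay on the visual imagery.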

18. A system comprising:
- a communication interface to receive visual imagery from a client device, wherein the visual imagery comprises text;
  
  an electronic memory to store the visual imagery;
  
  a text recognition engine to extract the text from the visual imagery; and
  
  a context engine to generate a plurality of contexts and rank the plurality of contexts based on relevance to the extracted text, wherein the plurality of contexts comprises a word from the text;
  
  wherein the communication interface is to transmit a first query to a database using the ranked plurality of contexts to generate a list of search results to be displayed on the client device as an augmented representation of the visual imagery on the client device, the augmented representation to include the visual imagery and a representation of the one or more items from the list integrated with the visual imagery; and
  
  a screen having a first region to display the augmented representation of the visual imagery and a second region to display the list of search results, wherein the first region of the screen has a first area size, the second region of the screen has a second area size, and the first area size is larger than the second area size.
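
Each independent claim recites a screen with two regions, where the region showing the augmented imagery has the larger area. One way to picture that constraint is the sketch below, which splits a screen vertically and checks the area relationship. The `Region` type, the `split_screen` function, and the default 70 percent split are assumptions chosen for illustration; the claims do not specify any particular geometry or proportion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    x: int
    y: int
    width: int
    height: int

    @property
    def area(self):
        return self.width * self.height


def split_screen(width, height, first_percent=70):
    """Split a screen into a top (first) region and a bottom (second) region.

    The first region holds the augmented imagery and the second holds the
    list of search results. The split is rejected unless the first region
    is strictly larger in area than the second.
    """
    if not 0 < first_percent < 100:
        raise ValueError("first_percent must be between 1 and 99")
    first_height = height * first_percent // 100
    first = Region(0, 0, width, first_height)
    second = Region(0, first_height, width, height - first_height)
    if first.area <= second.area:
        raise ValueError("first region must be larger than second region")
    return first, second
```

Integer arithmetic is used for the split so that the two heights always sum to the screen height without floating-point rounding.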

Current Assignee: Tahoe Research Limited (f/k/a Learndale Limited) (Vector Capital Corporation)
Original Assignee: Intel Corporation
Inventors: Gopalakrishnan, Kumar
Primary Examiner(s): Cheema, Azam
Application Number: US13/648,206
Publication Number: US 20130057583A1
Time in Patent Office: 1,666 Days

CPC Class Codes

G06F 16/51   Indexing; Data structures t...

G06F 16/58   Retrieval characterised by ...

G06F 16/583   using metadata automaticall...

G06F 16/9032   Query formulation

G06F 18/256   of results relating to diff...

G06F 2203/0381   Multimodal input, i.e. inte...

G06F 3/038   Control and interface arran...

G06V 10/811   the classifiers operating o...

G06V 20/63   Scene text, e.g. street names

G06V 30/10   Character recognition

G06V 30/274   Syntactic or semantic conte...
