VOICE DIRECTED CONTEXT SENSITIVE VISUAL SEARCH

US 20160019240A1
Filed: 07/06/2015
Published: 01/21/2016
Est. Priority Date: 10/03/2011
Status: Abandoned Application

Abstract

Various technologies described herein pertain to voice directed context sensitive visual searching. Visual content can be rendered on a display, and a voice directed query related to the visual content can be received. Contextual information related to the visual content can also be identified. Moreover, a search word recognized from the voice directed query and/or the contextual information can be used to detect an object from the visual content, where the object can be a part of the visual content. Further, a search can be performed using the object detected from the visual content, and a result of the search can be rendered on the display.
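
The flow in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the applicant's implementation: `ObjectImage`, `extract_search_word`, `voice_directed_search`, and the `detect` and `reverse_search` callables are hypothetical names, and the sketch assumes the voice query has already been transcribed to text by some speech recognizer.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class ObjectImage:
    """Hypothetical container for the part of the visual content that shows the object."""
    label: str
    pixels: bytes


def extract_search_word(query_text: str, vocabulary: Sequence[str]) -> Optional[str]:
    """Return the first word of the transcribed query that appears in the vocabulary."""
    known = {word.lower() for word in vocabulary}
    for raw in query_text.lower().split():
        token = raw.strip(".,?!\"'")
        if token in known:
            return token
    return None


def voice_directed_search(
    query_text: str,
    vocabulary: Sequence[str],
    detect: Callable[[str], Optional[ObjectImage]],
    reverse_search: Callable[[ObjectImage], List[str]],
) -> List[str]:
    """Search word -> object detected in the visual content -> search results."""
    word = extract_search_word(query_text, vocabulary)
    if word is None:
        return []  # no usable search word in the query
    detected = detect(word)  # assumed to run image processing on the rendered content
    if detected is None:
        return []  # nothing matching the search word was found
    return reverse_search(detected)  # the caller would render these results
```

Passing `detect` and `reverse_search` as callables keeps the sketch independent of any particular image-processing or search backend, neither of which the abstract names.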

40 Claims

1-20. (canceled)

21. A method of searching, comprising:
- receiving a voice directed query related to visual content rendered on a display, wherein the visual content is one of a frame from a video stream, a two-dimensional image, or a three-dimensional image;
  
  detecting an object from the visual content based on a search word from the voice directed query, wherein:
  
  detecting the object from the visual content further comprises performing image processing on the visual content to identify an image of the object from the visual content based on the search word from the voice directed query;
  
  the image of the object is a portion of the visual content and the visual content comprises a remainder of the visual content other than the image of the object; and
  
  an edge of the image of the object is not delineated in the visual content prior to the performing of the image processing on the visual content;
  
  using the image of the object identified from the visual content as an input for a reverse visual search, wherein the reverse visual search is executed based upon the image of the object identified from the visual content, and wherein the reverse visual search returns a result; and
  
  rendering the result of the reverse visual search on the display.
- Dependent Claims (22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33)
  - 22. The method of claim 21, detecting the object from the visual content further based on contextual information.
  - 23. The method of claim 21, detecting the object from the visual content further based on at least one of an identity of a device used to render the visual content, an identity of a device used to create the visual content, or an identity of a user from whom the voice directed query is received.
  - 24. The method of claim 21, the object being a graphical image of a physical entity captured in the visual content.
  - 25. The method of claim 21, detecting the object from the visual content further comprises progressively refining the object detected from the visual content.
  - 26. The method of claim 21, further comprising:
    - using at least one of the search word or contextual information as an input for a disparate search, wherein the disparate search outputs a returned image; and
      
      detecting the object from the visual content further based on the returned image.
  - 27. The method of claim 21, the visual content being the frame from the video stream, the video stream being a broadcasted video stream, and the voice directed query being received while the broadcasted video stream is rendered on the display.
  - 28. The method of claim 27, further comprising:
    - receiving metadata pertaining to the broadcasted video stream; and
      
      detecting the object from the visual content further based on the metadata pertaining to the broadcasted video stream.
  - 29. The method of claim 21, the voice directed query comprises a natural language query.
  - 30. The method of claim 21, performing the image processing on the visual content further comprises using an edge detection algorithm corresponding to a type of the object being detected.
  - 31. The method of claim 21, further comprising:
    - while rendering the result of the reverse visual search on the display, receiving a command to access a next result of the reverse visual search; and
      
      responsive to receiving the command to access the next result of the reverse visual search, rendering the next result of the reverse visual search on the display.
  - 32. The method of claim 21, further comprising:
    - capturing the visual content using a camera of a device, wherein the device comprises the display; and
      
      rendering the visual content on the display of the device.
  - 33. The method of claim 32, further comprising:
    - identifying a geographic location of the device at which the visual content was captured;
      
      using the search word and the geographic location as an input for a disparate search, wherein the disparate search outputs a returned image; and
      
      detecting the object from the visual content further based on the returned image.
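
As a rough illustration of claims 21 and 30, the sketch below picks an edge-detection setting according to the type of object named by the search word, finds the object's bounding box, and crops that portion out of the visual content, leaving the rest as the remainder. Everything here is an assumption made for illustration: the claims do not specify an algorithm, and the threshold table, the function names, and the toy "mark the brighter pixel of each intensity step" detector are all hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Image = List[List[int]]          # grayscale rows, values 0-255
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive

# Hypothetical per-type settings; the claims give no concrete values.
EDGE_THRESHOLD_BY_TYPE: Dict[str, int] = {"car": 50, "logo": 120}
DEFAULT_THRESHOLD = 80


def edge_pixels(image: Image, threshold: int) -> Set[Tuple[int, int]]:
    """Mark the brighter pixel of every horizontal or vertical step above threshold."""
    marked: Set[Tuple[int, int]] = set()
    height = len(image)
    for r in range(height):
        for c in range(len(image[r])):
            # Compare with the right-hand and lower neighbours, if they exist.
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if nr < height and nc < len(image[nr]):
                    a, b = image[r][c], image[nr][nc]
                    if abs(a - b) > threshold:
                        marked.add((r, c) if a > b else (nr, nc))
    return marked


def detect_object_box(image: Image, object_type: str) -> Optional[Box]:
    """Bounding box of the edge pixels found with the setting for this object type."""
    threshold = EDGE_THRESHOLD_BY_TYPE.get(object_type, DEFAULT_THRESHOLD)
    marked = edge_pixels(image, threshold)
    if not marked:
        return None
    rows = [r for r, _ in marked]
    cols = [c for _, c in marked]
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)


def crop(image: Image, box: Box) -> Image:
    """Return only the portion of the image inside the box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]
```

The toy detector only works for a bright object on a darker background. A real system would substitute a proper edge detector or segmentation model, and the cropped image would then serve as the input to the reverse visual search.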

34. A device, comprising:
- a camera;
  
  a display;
  
  at least one processor; and
  
  memory that comprises computer-executable instructions that, when executed by the at least one processor, cause the at least one processor to perform acts including:
  
  capturing visual content using the camera, wherein the visual content is one of a frame from a video stream, a two-dimensional image, or a three-dimensional image;
  
  rendering the visual content on the display;
  
  receiving a voice directed query related to the visual content rendered on the display;
  
  detecting an object from the visual content based on a search word from the voice directed query, wherein:
  
  detecting the object from the visual content further comprises performing image processing on the visual content to identify an image of the object from the visual content based on the search word from the voice directed query;
  
  the image of the object is a portion of the visual content and the visual content comprises a remainder of the visual content other than the image of the object; and
  
  an edge of the image of the object is not delineated in the visual content prior to the performing of the image processing on the visual content;
  
  using the image of the object identified from the visual content as an input for a reverse visual search, wherein the reverse visual search is executed based upon the image of the object identified from the visual content, and wherein the reverse visual search returns a result; and
  
  rendering the result of the reverse visual search on the display.
- Dependent Claims (35, 36)
  - 35. The device of claim 34, the memory further comprising computer-executable instructions that, when executed by the at least one processor, cause the at least one processor to perform acts including:
    - identifying a geographic location of the device at which the visual content was captured;
      
      using the search word and the geographic location as an input for a disparate search, wherein the disparate search outputs a returned image; and
      
      detecting the object from the visual content further based on the returned image.
  - 36. The device of claim 34, the memory further comprising computer-executable instructions that, when executed by the at least one processor, cause the at least one processor to perform acts including:
    - detecting the object from the visual content further based on contextual information.
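
One way to read claim 35 is sketched below: the search word and the capture location feed a separate ("disparate") search that returns a reference image, and the candidate region of the captured content that most resembles that reference is taken as the object. This is an assumption-laden sketch. The feature vectors, the squared-distance comparison, and the names `detect_with_returned_image` and `disparate_search` are hypothetical and do not come from the application.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Features = Sequence[float]      # hypothetical numeric descriptor of an image region
Location = Tuple[float, float]  # (latitude, longitude) where the content was captured


def squared_distance(a: Features, b: Features) -> float:
    """Sum of squared differences between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def detect_with_returned_image(
    search_word: str,
    location: Location,
    disparate_search: Callable[[str, Location], Features],
    candidates: List[Tuple[str, Features]],
) -> Optional[str]:
    """Return the id of the candidate region closest to the returned reference image."""
    if not candidates:
        return None
    # The disparate search takes the search word and the geographic location as input.
    reference = disparate_search(search_word, location)
    best_id, _ = min(candidates, key=lambda item: squared_distance(item[1], reference))
    return best_id
```

Using the location narrows what the reference image is likely to show; for example, the word "tower" spoken in one city suggests a different landmark than the same word spoken in another.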

37. A system, comprising:
- at least one processor; and
  
  memory that comprises computer-executable instructions that, when executed by the at least one processor, cause the at least one processor to perform acts including:
  
  rendering a video stream on a display;
  
  receiving a voice directed query related to the video stream rendered on the display;
  
  capturing a frame from the video stream in response to the voice directed query;
  
  detecting an object from the frame based on a search word from the voice directed query, wherein:
  
  detecting the object from the frame further comprises performing image processing on the frame to identify an image of the object from the frame based on the search word from the voice directed query;
  
  the image of the object is a portion of the frame and the frame comprises a remainder other than the image of the object; and
  
  an edge of the image of the object is not delineated in the frame prior to the performing of the image processing on the frame;
  
  using the image of the object identified from the frame as an input for a reverse visual search, wherein the reverse visual search is executed based upon the image of the object identified from the frame, and wherein the reverse visual search returns a result; and
  
  rendering the result of the reverse visual search on the display.
- Dependent Claims (38, 39, 40)
  - 38. The system of claim 37, the memory further comprising computer-executable instructions that, when executed by the at least one processor, cause the at least one processor to perform acts including:
    - using at least one of the search word or contextual information as an input for a disparate search, wherein the disparate search outputs a returned image; and
      
      detecting the object from the frame further based on the returned image.
  - 39. The system of claim 37, the memory further comprising computer-executable instructions that, when executed by the at least one processor, cause the at least one processor to perform acts including:
    - receiving metadata pertaining to the video stream; and
      
      detecting the object from the frame further based on the metadata pertaining to the video stream.
  - 40. The system of claim 37, the memory further comprising computer-executable instructions that, when executed by the at least one processor, cause the at least one processor to perform acts including:
    - rendering the result of the reverse visual search and the video stream concurrently on the display.
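
The frame-capture behavior of claim 37 and the concurrent rendering of claim 40 might look something like the following. It is a minimal sketch under stated assumptions: frames are represented by plain strings, `VideoSession` and its methods are hypothetical names, and the `search` callable stands in for the detection and reverse visual search steps.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class VideoSession:
    """Hypothetical player state: a list of frames and the current playback position."""
    frames: List[str]
    position: int = 0
    results: Optional[List[str]] = None

    def advance(self) -> None:
        """Move playback forward by one frame, stopping at the last frame."""
        if self.position < len(self.frames) - 1:
            self.position += 1

    def on_voice_query(self, query_text: str,
                       search: Callable[[str, str], List[str]]) -> str:
        """Capture the frame on screen when the query arrives and search with it."""
        captured = self.frames[self.position]
        self.results = search(captured, query_text)
        return captured

    def render(self) -> Dict[str, object]:
        """Describe what is shown: the playing video and any results, side by side."""
        return {"video": self.frames[self.position], "results": self.results}
```

Because the frame is captured at query time, playback can continue while the results are shown, so the results may refer to an earlier frame than the one currently on screen.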

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Inventors: Hammontree, Monty Lee; Bapat, Vikram; Athans, Emmanuel John

Application Number: US14/791,536
Publication Number: US 20160019240A1
Field of Search
US Class Current: 1/1
CPC Class Codes

G06F 16/243   Natural language query form...

G06F 16/248   Presentation of query results

G06F 16/532   Query formulation, e.g. gra...

G06F 16/583   using metadata automaticall...

G06F 16/783   using metadata automaticall...

G06F 16/951   Indexing; Web crawling tech...

G06F 3/167   Audio in a user interface, ...

H04N 5/77   between a recording apparat...
