Systems and methods for performing actions in response to user gestures in captured images
First Claim
1. A computer-implemented system comprising:
an image capture device that captures images;
a memory device that stores instructions; and
at least one processor that executes the instructions to perform operations comprising:
receiving, from the image capture device, at least one image including a gesture made by a user;
analyzing the at least one image to identify the gesture made by the user in the at least one image;
determining, based on the identified gesture, a first action to perform on the at least one image;
determining a selection area for the gesture;
identifying an area of interest in the at least one image based on the determined selection area of the gesture, wherein the area of interest includes non-textual content;
performing the first action on the identified area of interest, wherein the first action comprises:
classifying the non-textual content included in the area of interest into at least one of a plurality of different types of non-textual content into which the non-textual content is classifiable by the computer-implemented system, wherein the computer-implemented system is capable of recognizing each of a face, an object, and a landscape; and
generating a first result that indicates the at least one type of non-textual content into which the non-textual content included in the area of interest was classified;
determining a second action to be performed on the identified area of interest based at least in part on the at least one type of non-textual content into which the non-textual content included in the area of interest was classified; and
performing the second action on the identified area of interest.
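The claimed pipeline can be sketched in code. The following is a minimal, illustrative sketch only; every name (`Gesture`, `classify_content`, `SECOND_ACTIONS`, `handle_image`) and the mapping from content type to second action are assumptions for clarity, not part of the patent.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Gesture:
    kind: str              # e.g. "circle", "point" (hypothetical labels)
    selection_area: tuple  # (x, y, width, height) in image pixels

# Assumed mapping from classified content type to a follow-up
# ("second") action; the patent does not specify concrete actions.
SECOND_ACTIONS: dict[str, Callable[[object], str]] = {
    "face": lambda roi: "tag_person",
    "object": lambda roi: "search_product",
    "landscape": lambda roi: "lookup_location",
}

def classify_content(roi) -> str:
    """Stand-in classifier: per the claim, the system must be capable of
    recognizing each of a face, an object, and a landscape."""
    # A real implementation would run a trained model here.
    return "face"

def handle_image(image, gesture: Gesture) -> str:
    # 1. Identify the area of interest from the gesture's selection area.
    x, y, w, h = gesture.selection_area
    roi = [row[x:x + w] for row in image[y:y + h]]
    # 2. First action: classify the non-textual content in the area.
    content_type = classify_content(roi)
    # 3. Second action is determined by the classification result.
    return SECOND_ACTIONS[content_type](roi)

image = [[0] * 8 for _ in range(8)]  # toy 8x8 "image"
print(handle_image(image, Gesture("circle", (1, 1, 4, 4))))  # → tag_person
```

The key structural point the sketch captures is that the second action is not fixed in advance: it is selected only after the first action classifies the content in the gesture-selected region.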
Abstract
Systems, methods, and computer-readable media are provided for performing actions in response to gestures made by a user in captured images. In accordance with one implementation, a computer-implemented system is provided that includes an image capture device that captures at least one image, a memory device that stores instructions, and at least one processor that executes the instructions stored in the memory device. In some implementations, the processor receives, from the image capture device, at least one image including a gesture made by a user and analyzes the at least one image to identify the gesture made by the user. In some implementations, the processor also determines, based on the identified gesture, one or more actions to perform on the at least one image.
20 Claims
1. A computer-implemented system comprising:
an image capture device that captures images;
a memory device that stores instructions; and
at least one processor that executes the instructions to perform operations comprising:
receiving, from the image capture device, at least one image including a gesture made by a user;
analyzing the at least one image to identify the gesture made by the user in the at least one image;
determining, based on the identified gesture, a first action to perform on the at least one image;
determining a selection area for the gesture;
identifying an area of interest in the at least one image based on the determined selection area of the gesture, wherein the area of interest includes non-textual content;
performing the first action on the identified area of interest, wherein the first action comprises:
classifying the non-textual content included in the area of interest into at least one of a plurality of different types of non-textual content into which the non-textual content is classifiable by the computer-implemented system, wherein the computer-implemented system is capable of recognizing each of a face, an object, and a landscape; and
generating a first result that indicates the at least one type of non-textual content into which the non-textual content included in the area of interest was classified;
determining a second action to be performed on the identified area of interest based at least in part on the at least one type of non-textual content into which the non-textual content included in the area of interest was classified; and
performing the second action on the identified area of interest.
View Dependent Claims (2, 3, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
4. A non-transitory, computer-readable medium storing instructions, the instructions configured to cause at least one processor to perform operations comprising:
receiving at least one image including a gesture made by a user;
analyzing the at least one image to identify the gesture made by the user in the at least one image;
determining, based on the gesture, a first action to perform on the at least one image;
determining a selection area for the gesture;
identifying an area of interest in the at least one image based on the determined selection area of the gesture, wherein the area of interest includes non-textual content; and
performing the first action on the identified area of interest, wherein the first action comprises:
classifying the non-textual content included in the area of interest into at least one of a plurality of different types of non-textual content into which the non-textual content is classifiable by the at least one processor, wherein the instructions are configured to cause the at least one processor to be capable of recognizing each of a face, an object, and a landscape; and
generating a first result that indicates the type of non-textual content into which the content included in the area of interest was classified;
determining a second action to be performed on the identified area of interest based at least in part on the at least one type of non-textual content into which the non-textual content included in the area of interest was classified; and
performing the second action on the identified area of interest.
View Dependent Claims (5, 6)
7. A method comprising the following operations performed by one or more processors:
receiving at least one image including a single gesture made by a user;
analyzing the at least one image to identify the single gesture made by the user in the at least one image;
determining, based on the single gesture, a first action to perform on the at least one image;
determining a selection area indicated by the single gesture, such that both of the first action and the selection area are determined based on the single gesture made by the user in the at least one image;
identifying an area of interest in the at least one image based on the determined selection area indicated by the gesture; and
performing the first action on the identified area of interest, wherein performing the first action comprises recognizing, by the one or more processors that are capable of recognizing each of a face, an object, or a landscape, at least one of the face, the object, or the landscape within the area of interest.
View Dependent Claims (8, 9)
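Claim 7's distinguishing feature is that a single gesture determines both the first action and the selection area. A hypothetical sketch of that derivation, in which the gesture labels, action names, and the closed-path heuristic are all illustrative assumptions rather than anything recited in the claim:

```python
# Hypothetical mapping from gesture kind to first action.
GESTURE_ACTIONS = {
    "circle": "recognize",   # circling a region triggers recognition
    "underline": "copy",
}

def parse_single_gesture(points):
    """Derive both the first action and the bounding selection area
    from one gesture path, given as a list of (x, y) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # Assumption: a closed path is treated as a circling gesture.
    kind = "circle" if points[0] == points[-1] else "underline"
    area = (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))
    return GESTURE_ACTIONS[kind], area

action, area = parse_single_gesture([(2, 2), (6, 2), (6, 6), (2, 6), (2, 2)])
print(action, area)  # → recognize (2, 2, 4, 4)
```

Because both outputs come from one call on one gesture path, no second user input is needed to pick the action, which mirrors the "single gesture" limitation of the claim.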
Specification