Hand-gesture-based region of interest localization

  • US 9,778,750 B2
  • Filed: 09/30/2014
  • Issued: 10/03/2017
  • Est. Priority Date: 09/30/2014
  • Status: Active Grant
First Claim

1. A method for localizing a region of interest in an ego-centric video using a hand gesture, comprising:

  • acquiring, by a processor, an image containing the hand gesture from the ego-centric video;

    detecting, by the processor, pixels that correspond to one or more hands in the image using a hand segmentation algorithm;

    identifying, by the processor, a hand enclosure in the pixels that are detected within the image, wherein the identifying comprises:

    generating, by the processor, a binary mask of the pixels that are detected within the image; and

    applying, by the processor, an image processing to the binary mask to reduce a probability of false positives and false negatives occurring in the binary mask, wherein the image processing comprises:

    detecting, by the processor, a plurality of inner contour holes that are located within a border around the pixels that correspond to the one or more hands in the image from a plurality of contour holes within a frame that include a plurality of outer contour holes that are located outside of the border in the binary mask;

    calculating, by the processor, a respective size percentage of each one of the plurality of inner contour holes, wherein the respective size percentage is calculated based on a length of a diagonal of a respective inner contour hole divided by a length of a diagonal of the frame;

    eliminating, by the processor, one or more of the plurality of inner contour holes that have the respective size percentage that are outside of a predefined range of size percentages; and

    identifying, by the processor, a single inner contour hole from a remaining plurality of inner contour holes that is closest to a center of the frame as the hand enclosure;

    localizing, by the processor, a region of interest based on the hand enclosure; and

    performing, by the processor, an action based on an object in the region of interest, wherein the object comprises text and the performing the action comprises:

    recognizing, by the processor, the text using an optical character recognition program; and

    automatically populating, by the processor, one or more fields of a form using the text that is identified, wherein the text comprises alphanumeric text on a license plate and the form comprises a citation for a traffic violation.
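
The hole-filtering steps recited above (size percentage from diagonals, elimination outside a predefined range, selection of the hole closest to the frame center) can be illustrated with a minimal sketch. This is not the patented implementation. It assumes the inner contour holes have already been detected and are given as bounding boxes, that the "diagonal" of a hole is the diagonal of its bounding box, and that the range bounds `min_pct` and `max_pct` are placeholder values; the claim does not specify any of these. All function and parameter names are hypothetical.

```python
import math
from typing import List, Optional, Tuple

# Assumption: each inner contour hole is represented by its bounding box
# (x, y, width, height) in pixel coordinates of the frame.
Box = Tuple[int, int, int, int]


def size_percentage(box: Box, frame_w: int, frame_h: int) -> float:
    """Diagonal of the hole's bounding box divided by the frame diagonal."""
    _, _, w, h = box
    return math.hypot(w, h) / math.hypot(frame_w, frame_h)


def select_hand_enclosure(
    inner_holes: List[Box],
    frame_w: int,
    frame_h: int,
    min_pct: float = 0.1,  # assumed lower bound, not taken from the claim
    max_pct: float = 0.7,  # assumed upper bound, not taken from the claim
) -> Optional[Box]:
    """Drop holes outside the size range, then pick the one nearest the center."""
    kept = [
        box for box in inner_holes
        if min_pct <= size_percentage(box, frame_w, frame_h) <= max_pct
    ]
    if not kept:
        return None  # no hole qualifies as a hand enclosure

    cx, cy = frame_w / 2, frame_h / 2

    def distance_to_center(box: Box) -> float:
        x, y, w, h = box
        return math.hypot(x + w / 2 - cx, y + h / 2 - cy)

    return min(kept, key=distance_to_center)


if __name__ == "__main__":
    holes = [
        (10, 10, 8, 6),        # tiny hole, likely segmentation noise
        (200, 150, 240, 180),  # centered, mid-sized hole
        (0, 0, 160, 120),      # mid-sized hole in a corner
        (0, 0, 640, 480),      # spans the whole frame
    ]
    print(select_hand_enclosure(holes, frame_w=640, frame_h=480))
```

In this sketch the region of interest would then be localized from the returned box; how the enclosure maps to the final region is left open here, as the claim only states that localization is "based on the hand enclosure".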
