Motion-assisted visual language for human computer interfaces

  • US 9,829,984 B2
  • Filed: 11/20/2013
  • Issued: 11/28/2017
  • Est. Priority Date: 05/23/2013
  • Status: Active Grant
First Claim
1. A computer-implemented method for recognizing a visual gesture, the method comprising:

  • receiving a visual gesture formed by a part of a human body, the visual gesture being captured in a video having a plurality of video frames;

    determining a region of interest (ROI) in the plurality of video frames of the video based on motion vectors associated with the part of the human body, the centroid of the ROI being aligned with the centroid of a cluster of the motion vectors;

    selecting a visual gesture recognition process based on a user selection from a plurality of visual gesture recognition processes;

    applying the selected visual gesture recognition process to the plurality of video frames to recognize the visual gesture;

    determining variations in the centroid, shape, and size of an object within the ROI of the plurality of video frames, the centroid, shape, and size of the object changing according to motion of the object in the plurality of video frames in an affine motion model, wherein said determination of the variations in the centroid, shape, and size of the object within the ROI is performed by a tracking-learning-detection-type (TLD-type) process, wherein the TLD-type process is a signal processing scheme in which the following functions are performed simultaneously:

    object tracking, by use of motion estimation in the affine motion model, using either optical flow or block-based motion estimation, and employing estimation error metrics comprising a sum of absolute differences (SAD) and a normalized correlation coefficient (NCC);

    object feature learning, which automatically learns features of objects within the ROI, the features including size, centroids, statistics and edges; and

    object detection, comprising:

    feature extraction employing edge analysis, spatial transforms, and background subtraction;

    feature analysis employing clustering and vector quantization; and

    feature matching employing signal matching using similarity metrics, neural networks, support vector machines, and maximum a posteriori probability; and

    deriving three or more dimensional information and relationships of objects contained in the visual gesture from the plurality of video frames capturing the visual gesture based on the analysis of the variations in the centroid, shape, and size of the object within the ROI.
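The ROI step in the claim centers a region of interest on the centroid of a cluster of motion vectors. A minimal sketch of that idea follows; the function name, the fixed ROI size, and the input format are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def roi_from_motion_vectors(vectors, roi_size=(64, 64)):
    """Return an (x, y, w, h) ROI whose centroid coincides with the
    centroid of a cluster of motion-vector positions.

    `vectors` is an (N, 2) sequence of (x, y) positions of motion
    vectors attributed to the moving body part (assumed input format).
    """
    pts = np.asarray(vectors, dtype=float)
    cx, cy = pts.mean(axis=0)          # centroid of the cluster
    w, h = roi_size
    # Place the ROI so its center lands on the cluster centroid.
    return (cx - w / 2, cy - h / 2, w, h)

roi = roi_from_motion_vectors([(10, 10), (14, 12), (12, 8)])
# cluster centroid is (12.0, 10.0), so the 64x64 ROI is centered there
```

In practice the cluster itself would first be isolated (e.g. by clustering the frame's motion field), but the centroid alignment shown here is the property the claim recites.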
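The object-tracking limitation names two estimation error metrics: the sum of absolute differences (SAD) and the normalized correlation coefficient (NCC). Both are standard block-matching scores; a compact sketch, with illustrative function names:

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks.

    Lower is better; 0 means the blocks are identical.
    """
    return np.abs(a.astype(float) - b.astype(float)).sum()

def ncc(a, b):
    """Normalized correlation coefficient between two blocks, in [-1, 1].

    Mean subtraction and normalization make the score invariant to
    brightness offset and contrast scaling; 1 means a perfect match.
    """
    a = a.astype(float).ravel()
    b = b.astype(float).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom)

block = np.array([[1, 2], [3, 4]])
brighter = block + 10
# sad penalizes the brightness offset; ncc is invariant to it
```

A tracker would evaluate these scores over candidate displacements of the block between consecutive frames and keep the best-scoring one.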
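The claim's affine motion model is what lets centroid, shape, and size vary together: under x' = A x + t, the object's centroid maps to A @ centroid + t and its area scales by |det(A)|. A small sketch of that consequence, with illustrative names and values:

```python
import numpy as np

def apply_affine(points, A, t):
    """Map object contour points under an affine motion model x' = A x + t."""
    pts = np.asarray(points, dtype=float)
    return pts @ A.T + t

# Unit square with centroid (0.5, 0.5)
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
A = np.array([[2.0, 0.0],
              [0.0, 2.0]])   # uniform scale by 2
t = np.array([3.0, 1.0])     # translation

moved = apply_affine(square, A, t)
# centroid moves from (0.5, 0.5) to (4.0, 2.0); area scales by det(A) = 4
```

Tracking the frame-to-frame changes in these quantities is what the claim's "variations in the centroid, shape, and size" step observes.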

  • 1 Assignment