Signal detection, recognition and tracking with feature vector transforms

US 9,858,681 B2
Filed: 10/27/2015
Issued: 01/02/2018
Est. Priority Date: 10/27/2014
Status: Active Grant

Abstract

A method for obtaining object surface topology in which image frames of a scene (e.g., video frames from a user passing a smartphone camera over an object) are transformed into dense feature vectors, and feature vectors are correlated to obtain high precision depth maps. Six dimensional pose is determined from the video sequence, and then used to register patches of pixels from the frames. Registered patches are aligned and then correlated to local shifts. These local shifts are converted to precision depth maps, which are used to characterize surface detail of an object. Feature vector transforms are leveraged in a signal processing method comprising several levels of interacting loops. At a first loop level, a structure from motion loop process extracts anchor features from image frames. At another level, an interacting loop process extracts surface texture, as noted. At additional levels, object forms are segmented from the images, and objects are counted and/or measured. At still a higher level, the lower level data structures providing feature extraction, 3D structure and pose estimation, and object surface registration are exploited by higher level loop processes for object identification (e.g., using machine learning classification), digital watermark or bar code reading and image recognition from the registered surfaces stored in lower level data structures.
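
The correlation step in the abstract (registered patches correlated to find local shifts) can be illustrated with a brute-force search. This is only a minimal sketch under stated assumptions, not the patented implementation: the function name `best_shift`, the integer-only search window `max_shift`, and the mean dot-product score are all hypothetical choices made for illustration. A practical system would likely use a faster correlation and sub-pixel refinement.

```python
def best_shift(feat_a, feat_b, max_shift=2):
    """Return the integer (dy, dx) for which feat_b[y+dy][x+dx] best matches feat_a[y][x].

    feat_a and feat_b are equal-sized 2-D lists of per-pixel feature vectors
    (tuples). The score is the mean dot product over the overlapping region,
    an assumed stand-in for whatever correlation measure a real system uses.
    Requires max_shift to be smaller than both array dimensions.
    """
    h, w = len(feat_a), len(feat_a[0])
    best, best_score = (0, 0), float("-inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score, count = 0.0, 0
            # Visit only pixels whose shifted partner lies inside feat_b.
            for y in range(max(0, -dy), min(h, h - dy)):
                for x in range(max(0, -dx), min(w, w - dx)):
                    a, b = feat_a[y][x], feat_b[y + dy][x + dx]
                    score += sum(p * q for p, q in zip(a, b))
                    count += 1
            score /= count  # normalize so small overlaps are not penalized
            if score > best_score:
                best, best_score = (dy, dx), score
    return best
```

Running this on small patches rather than whole frames would give one shift per patch, which is the kind of local shift field the abstract describes converting into a depth map.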

18 Claims

1. A method of obtaining surface detail of an object from a video sequence captured by a moving camera over the object, the method comprising:
- providing a camera model and the video sequence;
  
  using a hardware processor of a computer system, determining pose estimation from the video sequence using the camera model;
  
  using a hardware processor of a computer system, registering images from different frames using the pose estimation;
  
  using a hardware processor of a computer system, performing a feature vector transform on the images to produce N-dimensional feature vector per pixel of the images, the feature vector transform producing for each pixel in an array of pixels, a first vector component corresponding to plural comparisons between a center pixel and pixels at plural directions around the center pixel for a first scale, and second vector component corresponding to plural comparisons between the center pixel and pixels at plural directions around the center pixel for a second scale;
  
  using a hardware processor of a computer system, correlating the feature vector transforms of the images to obtain shift measurements between the images; and
  
  using a hardware processor of a computer system, obtaining surface height detail of the object from the shift measurements.
- - 2. The method of claim 1 wherein the determining of pose estimation comprises:
    - performing a feature vector transform on frames of the video sequence, the feature vector transform producing for each pixel in an array of pixels, a vector component corresponding to plural comparisons between a center pixel and pixels at plural directions around the center pixel;
      
      using a hardware processor, finding shifts between a first feature vector transformed frame and at least a second feature vector transformed frame; and
      
      using a hardware processor, determining the pose estimation from the shifts.
  - 3. The method of claim 1 wherein the plural comparisons at the first and second scales comprise quantized differences.
  - 4. The method of claim 3 wherein the quantized differences are encoded in arcs of a ring at the first and second scales.
  - 5. The method of claim 1 wherein the plural comparisons at each of the first and second scales are converted to a gradient.
  - 6. The method of claim 5 wherein the gradient comprises a magnitude and direction to produce at least two vector components per scale.
  - 7. The method of claim 1 wherein providing the video sequence comprises obtaining the video sequence from a mobile device camera, which captures the video sequence as the mobile device camera is moved over the object.
  - 8. The method of claim 1 wherein the hardware processor comprises a hardware processor in a mobile device comprising the mobile device camera.
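
The feature vector transform recited in claims 1 and 3 (comparisons between a center pixel and pixels in several directions, at two scales, with quantized differences) might be sketched as follows. Everything specific here is an assumption for illustration and is not taken from the patent: the eight compass directions, the ternary quantizer, the `threshold` value, summing the quantized comparisons into one component per scale, and edge clamping at the image border.

```python
# Eight compass directions as (dy, dx); a hypothetical choice of "plural directions".
DIRECTIONS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def quantize(diff, threshold):
    """Ternary quantization of a pixel difference: -1, 0, or +1."""
    if diff > threshold:
        return 1
    if diff < -threshold:
        return -1
    return 0


def ring_component(image, y, x, radius, threshold):
    """Sum of quantized neighbor-minus-center differences on one ring (one scale)."""
    h, w = len(image), len(image[0])
    center = image[y][x]
    total = 0
    for dy, dx in DIRECTIONS:
        # Clamp coordinates so border pixels reuse the nearest edge value.
        ny = min(max(y + dy * radius, 0), h - 1)
        nx = min(max(x + dx * radius, 0), w - 1)
        total += quantize(image[ny][nx] - center, threshold)
    return total


def feature_vector_transform(image, scales=(1, 2), threshold=2):
    """Return a dense array holding one feature vector (tuple) per pixel,
    with one component per scale in `scales`."""
    return [
        [tuple(ring_component(image, y, x, r, threshold) for r in scales)
         for x in range(len(image[0]))]
        for y in range(len(image))
    ]
```

With the default `scales=(1, 2)` each pixel gets a 2-dimensional vector; adding more radii or keeping the individual comparisons instead of their sum would raise N.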

9. A non-transitory computer readable medium on which is stored instructions, which when executed by one or more processors, perform a method of obtaining surface detail of an object from a video sequence captured by a moving camera over the object, the method comprising:
- determining pose estimation from the video sequence using a camera model;
  
  registering images from different frames using the pose estimation;
  
  performing a feature vector transform on the images to produce N-dimensional feature vector per pixel of the images, the feature vector transform producing for each pixel in an array of pixels, a first vector component corresponding to plural comparisons between a center pixel and pixels at plural directions around the center pixel for a first scale, and second vector component corresponding to plural comparisons between the center pixel and pixels at plural directions around the center pixel for a second scale;
  
  correlating the feature vector transforms of the images to obtain shift measurements between the images; and
  
  obtaining surface height detail of the object from the shift measurements.
- - 10. The computer readable medium of claim 9 wherein the determining of pose estimation comprises:
    - performing a feature vector transform on frames of the video sequence, the feature vector transform producing for each pixel in an array of pixels, a vector component corresponding to plural comparisons between a center pixel and pixels at plural directions around the center pixel;
      
      finding shifts between a first feature vector transformed frame and at least a second feature vector transformed frame; and
      
      determining the pose estimation from the shifts.
  - 11. The computer readable medium of claim 9 wherein the plural comparisons at the first and second scales comprise quantized differences.
  - 12. The computer readable medium of claim 11 wherein the quantized differences are encoded in arcs of a ring at the first and second scales.
  - 13. The computer readable medium of claim 9 wherein the plural comparisons at each of the first and second scales are converted to a gradient.
  - 14. The computer readable medium of claim 13 wherein the gradient comprises a magnitude and direction to produce at least two vector components per scale.
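
Claims 13 and 14 (like claims 5 and 6) describe converting the comparisons at each scale to a gradient with a magnitude and a direction. One way this could look, as a sketch only: weight each neighbor-minus-center difference by the unit vector of its direction and sum. The names `ring_gradient` and `gradient_features`, the eight evenly spaced directions, nearest-pixel sampling, and border clamping are assumptions, not details from the patent.

```python
import math


def ring_gradient(image, y, x, radius, n_dirs=8):
    """Return (magnitude, direction in radians) from comparisons on one ring.

    Each neighbor-minus-center difference is projected onto the unit vector
    pointing toward that neighbor; the projections are summed into (gx, gy).
    """
    h, w = len(image), len(image[0])
    center = image[y][x]
    gx = gy = 0.0
    for k in range(n_dirs):
        theta = 2 * math.pi * k / n_dirs
        # Nearest-pixel sample on the ring, clamped to the image border.
        ny = min(max(y + round(radius * math.sin(theta)), 0), h - 1)
        nx = min(max(x + round(radius * math.cos(theta)), 0), w - 1)
        diff = image[ny][nx] - center
        gx += diff * math.cos(theta)
        gy += diff * math.sin(theta)
    return math.hypot(gx, gy), math.atan2(gy, gx)


def gradient_features(image, y, x, scales=(1, 2)):
    """Flatten (magnitude, direction) for each scale into one feature vector."""
    vec = []
    for r in scales:
        vec.extend(ring_gradient(image, y, x, r))
    return tuple(vec)
```

With two scales this yields four components per pixel, matching the "at least two vector components per scale" wording.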

15. A mobile device comprising:
- a camera for capturing a video sequence of an object, the video sequence comprising plural frames;
  
  a processor programmed with instructions that configure the processor to:
  
  determine pose estimation from the video sequence using a camera model;
  
  align images from different frames using the pose estimation;
  
  perform a feature vector transform on the images to produce N-dimensional feature vectors per pixel of the images, the feature vector transform producing for each pixel in an array of pixels, a first vector component corresponding to plural comparisons between a center pixel and pixels at plural directions around the center pixel for a first scale, and second vector component corresponding to plural comparisons between the center pixel and pixels at plural directions around the center pixel for a second scale;
  
  correlate the feature vector transforms of the images to obtain shift measurements between the images; and
  
  obtain surface height detail of the object from the shift measurements.

16. A system for obtaining surface detail of an object from a video sequence captured by a moving camera over the object, the system comprising:
- means for estimating pose of the object relative to the camera from the video sequence;
  
  means for transforming the images into dense feature vector arrays, the feature vector arrays comprising a feature vector per pixel, the feature vector having a first vector component corresponding to plural comparisons between a center pixel and pixels at plural directions around the center pixel for a first scale, and second vector component corresponding to plural comparisons between the center pixel and pixels at plural directions around the center pixel for a second scale; and
  
  means for obtaining surface height detail of the object from the dense feature vector arrays.
- - 17. The system of claim 16 wherein the means for estimating pose comprises a processor programmed with instructions to:
    - determine a coarse 6D pose from the video sequence based on a camera model;
      
      obtain dense feature vector transforms of images in the video sequence;
      
      align the feature vector transforms with the coarse 6D pose; and
      
      determine a refined 6D pose from the aligned feature vector transforms.
  - 18. The system of claim 16 wherein the means for obtaining surface height detail comprises a processor programmed with instructions to:
    - obtain shift measurements between the images from the dense vector arrays; and
      
      obtain surface height detail of the object from the shift measurements.
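
Claim 18 obtains surface height from shift measurements but the claim text does not give a formula. As a hedged illustration, the sketch below uses textbook pinhole parallax geometry, which is an assumption and not necessarily what the patent uses: a camera that translates by `baseline` parallel to a reference plane at distance `ref_depth`, with focal length `focal_px` in pixels, sees a point at depth Z with disparity focal_px * baseline / Z. After registration to the reference plane, the residual shift is the extra disparity of points above that plane. All parameter names are hypothetical.

```python
def height_from_shift(shift_px, focal_px, baseline, ref_depth):
    """Convert a residual image shift (pixels) to height above a reference plane.

    Assumes a pinhole camera translating parallel to the plane. Positive
    shift_px means more parallax than the plane, i.e. a point closer to the
    camera. baseline and ref_depth share one length unit, and the result is
    returned in that unit.
    """
    ref_disparity = focal_px * baseline / ref_depth         # parallax of the plane itself
    depth = focal_px * baseline / (ref_disparity + shift_px)
    return ref_depth - depth


def height_map(shift_map, focal_px, baseline, ref_depth):
    """Apply height_from_shift to every entry of a 2-D list of shifts."""
    return [[height_from_shift(s, focal_px, baseline, ref_depth) for s in row]
            for row in shift_map]
```

Under these assumptions a zero shift maps to zero height, and for small shifts the height is roughly proportional to the shift, which is why precise shift measurement translates into fine surface detail.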

Current Assignee: Digimarc Corporation
Original Assignee: Digimarc Corporation
Inventors: Rhoads, Geoffrey B.
Primary Examiner(s): Koziol, Stephen R
Assistant Examiner(s): Azima, Shaghayegh

Application Number: US14/924,664
Publication Number: US 20160189381A1
Time in Patent Office: 798 Days
CPC Class Codes

G06T 2207/20016   Hierarchical, coarse-to-fin...

G06T 7/33   using feature-based methods

G06T 7/579   from motion
