Method for visual-based recognition of an object
First Claim
1. A method for visual-based recognition of an object, said method comprising:
receiving digital depth data for at least a pixel of an image of an object, which is not required to be inside of a subject, said depth data comprising information relating to a distance from a visual sensor to a portion of said object shown at said pixel, said visual sensor comprising an emitter and sensor of light, wherein said light is selected from the group of electromagnetic radiation consisting of visible light, infrared light, and ultraviolet light and wherein said receiving of said depth data does not require special behavior from one of said object and said subject;
generating a two-dimensional plan-view image based in part on said depth data, wherein said generating includes generating said plan-view image as if said object were viewed at an axis normal to ground level from above and wherein generating other view images based on different orientations of said object other than at said axis normal to ground level from above is not required;
extracting a plan-view template from said plan-view image, wherein at least a portion of said plan-view image is transformed; and
processing said plan-view template at a classifier that is executing on a computer system to assign a class to said plan-view template, wherein said classifier is trained to make a decision according to pre-configured parameters determined at least in part based on said class of said plan-view template.
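The first two claimed steps, turning per-pixel depth data into a plan-view image seen from directly above, can be sketched as follows. This is a minimal illustration, not the patented implementation: the pinhole intrinsics (fx, fy, cx, cy), the height band, and all parameter values are assumptions introduced here.

```python
import numpy as np

def depth_to_plan_view(depth, fx, fy, cx, cy, bin_size=0.05, grid=(64, 64),
                       height_band=(-3.0, 3.0)):
    """Back-project a depth image to 3-D points and accumulate them into a
    plan-view occupancy map, as if the scene were viewed from above along
    an axis normal to the ground. All parameters are illustrative."""
    h, w = depth.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    z = depth                      # distance along the optical axis
    x = (us - cx) * z / fx         # lateral ground-plane coordinate
    y = (vs - cy) * z / fy         # height coordinate

    # Keep only points inside an assumed vertical band of interest.
    keep = (y >= height_band[0]) & (y <= height_band[1])

    # Quantise (x, z) into ground-grid cells: one cell per vertical bin.
    gx = np.clip(np.floor(x / bin_size).astype(int) + grid[1] // 2, 0, grid[1] - 1)
    gz = np.clip(np.floor(z / bin_size).astype(int), 0, grid[0] - 1)

    plan_view = np.zeros(grid)
    np.add.at(plan_view, (gz[keep], gx[keep]), 1)  # occupancy count per bin
    return plan_view
```

A uniform depth image maps every pixel into the single grid row at that depth, which is the "viewed from above" property the claim describes.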
Abstract
A method for visual-based recognition of an object is described. Depth data is received for at least a pixel of an image of the object, the depth data comprising information relating to a distance from a visual sensor to a portion of the object visible at the pixel. At least one plan-view image is generated based on the depth data. At least one plan-view template is extracted from the plan-view image. The plan-view template is processed by at least one classifier, wherein the classifier is trained to make a decision according to pre-configured parameters.
40 Claims
1. A method for visual-based recognition of an object, said method comprising:
receiving digital depth data for at least a pixel of an image of an object, which is not required to be inside of a subject, said depth data comprising information relating to a distance from a visual sensor to a portion of said object shown at said pixel, said visual sensor comprising an emitter and sensor of light, wherein said light is selected from the group of electromagnetic radiation consisting of visible light, infrared light, and ultraviolet light and wherein said receiving of said depth data does not require special behavior from one of said object and said subject;
generating a two-dimensional plan-view image based in part on said depth data, wherein said generating includes generating said plan-view image as if said object were viewed at an axis normal to ground level from above and wherein generating other view images based on different orientations of said object other than at said axis normal to ground level from above is not required;
extracting a plan-view template from said plan-view image, wherein at least a portion of said plan-view image is transformed; and
processing said plan-view template at a classifier that is executing on a computer system to assign a class to said plan-view template, wherein said classifier is trained to make a decision according to pre-configured parameters determined at least in part based on said class of said plan-view template.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22)
23. A visual-based recognition system comprising:
a visual sensor for capturing depth data for at least a pixel of an image of an object, which is not required to be inside of a subject, said depth data comprising information relating to a distance from said visual sensor to a portion of said object visible at said pixel, said visual sensor comprising an emitter and sensor of light, wherein said light is selected from the group of electromagnetic radiation consisting of visible light, infrared light, and ultraviolet light and wherein said capturing of said depth data does not require special behavior from one of said object and said subject;
a plan-view image generator for generating a two-dimensional plan-view image based on said depth data, wherein said generating of said plan-view image includes generating said plan-view image as if said object were viewed at an axis normal to ground level from above and wherein generating other view images based on different orientations of said object other than at said axis normal to ground level from above is not required;
a plan-view template generator for generating a plan-view template based on said plan-view image; and
a classifier for making a decision concerning recognition of said object, wherein said classifier is trained to make said decision according to pre-configured parameters that were determined at least in part based on a class assigned to said plan-view template.
- View Dependent Claims (24, 25, 26, 27, 28, 29, 30, 31)
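The classifier element of this system claim is specified only by its behaviour: a decision made according to pre-configured parameters fitted per class. One minimal stand-in consistent with that description is a nearest-centroid classifier over flattened plan-view templates; the class name, method names, and the centroid technique itself are assumptions introduced here, not details from the patent.

```python
import numpy as np

class PlanViewClassifier:
    """Nearest-centroid decision over flattened plan-view templates.

    The claim's 'pre-configured parameters' correspond here to per-class
    centroids fitted offline from labelled templates (illustrative only).
    """
    def fit(self, templates, labels):
        # Fit one centroid per class from labelled training templates.
        X = np.stack([np.asarray(t).ravel() for t in templates])
        y = np.asarray(labels)
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, template):
        # Assign the class whose centroid is nearest in Euclidean distance.
        d = np.linalg.norm(self.centroids_ - np.asarray(template).ravel(), axis=1)
        return self.classes_[int(np.argmin(d))]
```

For example, a classifier fitted on an all-zero "empty" template and an all-one "person" template assigns a mostly occupied map to "person".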
32. A method for visual-based recognition of an object represented in an image, said method comprising:
generating a three-dimensional point cloud based on digital depth data for at least a pixel of an image of said object, which is not required to be inside of a subject, said depth data comprising information relating to a distance from a visual sensor to a portion of said object visible at said pixel, said visual sensor comprising an emitter and sensor of light, wherein said light is selected from the group of electromagnetic radiation consisting of visible light, infrared light, and ultraviolet light, said three-dimensional point cloud representing a foreground surface visible to said visual sensor and wherein a pixel of said three-dimensional point cloud comprises a three-dimensional coordinate and wherein said generating of said three-dimensional point cloud does not require special behavior from one of said object and said subject;
partitioning said three-dimensional point cloud into a plurality of vertically oriented bins;
mapping at least a portion of points of said vertically oriented bins into at least one plan-view image based on said three-dimensional coordinates, wherein said plan-view image is a two-dimensional representation of said three-dimensional point cloud comprising at least one pixel corresponding to at least one vertically oriented bin of said plurality of vertically oriented bins, wherein said mapping includes generating said plan-view image as if said object were viewed at an axis normal to ground level from above; and
processing said plan-view image at a classifier that is executing on a computer system, wherein said classifier is trained to make a decision according to pre-configured parameters and wherein said pre-configured parameters were determined based at least in part on a class assigned to a plan-view template that was extracted from said plan-view image by transforming at least a portion of said plan-view image, and wherein said classifier does not require other view images based on orientations of said object other than at said axis normal to ground level from above in order to make said decision. - View Dependent Claims (33, 34, 35, 36, 37, 38, 39, 40)
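Claim 32's partitioning and mapping steps, plus one possible template "transform", can be sketched as follows. The axis convention (x lateral, y height, z depth), the parameter values, and the peak-centred cropping used as the transform are all assumptions for illustration; the patent does not prescribe them.

```python
import numpy as np

def cloud_to_plan_view(points, bin_size=0.05, grid=(64, 64)):
    """Drop each point of an (N, 3) cloud into the vertically oriented bin
    above its ground cell, giving a top-down plan-view occupancy image.
    Axis convention and parameters are illustrative assumptions."""
    pts = np.asarray(points, dtype=float)
    gx = np.clip(np.floor(pts[:, 0] / bin_size).astype(int), 0, grid[1] - 1)
    gz = np.clip(np.floor(pts[:, 2] / bin_size).astype(int), 0, grid[0] - 1)
    plan_view = np.zeros(grid)
    np.add.at(plan_view, (gz, gx), 1)   # point count per vertical bin
    return plan_view

def extract_template(plan_view, size=9):
    """One possible 'transform' of the plan-view image into a template:
    crop a fixed-size window around the occupancy peak and normalise it."""
    r, c = np.unravel_index(np.argmax(plan_view), plan_view.shape)
    half = size // 2
    padded = np.pad(plan_view, half)        # zero-pad so edge crops fit
    tpl = padded[r:r + size, c:c + size]    # window centred on (r, c)
    return tpl / max(tpl.max(), 1.0)
```

The resulting fixed-size, normalised template is the kind of input a classifier with pre-configured parameters could consume directly.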
Specification