Robot with vision-based 3D shape recognition
Abstract
The invention relates to a method for processing video signals from a video sensor, in order to extract 3D shape information about objects represented in the video signals, the method comprising the following steps:
- providing a memory in which objects are stored in a 3D shape space, the shape space being an abstract feature space encoding the objects' 3D shape properties, and
- mapping a 2D video signal representation of an object in the shape space, the coordinates of the object in the shape space indicating the object's 3D shape.
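As a rough sketch, the mapping step of the abstract can be pictured as measuring an observation's distance to stored cluster centres in the shape space and converting those distances to similarity scores. The class names, centre coordinates, and feature values below are purely illustrative, not taken from the patent:

```python
import math

# Hypothetical cluster centres in a 2-D shape space, one per trained
# object class (names and coordinates are illustrative only).
CENTERS = {"sphere": (0.9, 0.1), "box": (0.1, 0.8)}

def map_to_shape_space(features):
    """Return per-class similarity scores from distances to cluster centres.

    The coordinates of the observation relative to each centre (here:
    Euclidean distance) indicate how similar the 2D view is to each
    trained 3D shape, in the spirit of the abstract's mapping step.
    """
    dists = {name: math.dist(features, c) for name, c in CENTERS.items()}
    # Closer centre -> higher similarity.
    return {name: 1.0 / (1.0 + d) for name, d in dists.items()}

scores = map_to_shape_space((0.85, 0.15))
best = max(scores, key=scores.get)  # -> "sphere"
```

The inverse-distance similarity is one simple choice; any monotonically decreasing function of distance would serve the same role.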
20 Citations
5 Claims
1. A method for processing two-dimensional (2D) video signals from a video sensor, in order to extract three-dimensional (3D) shape information, invariant to pose and lighting changes, on at least one physical property of a physical object and its environment represented in the video signals, the method comprising the steps of:
- in an unsupervised training phase, presenting, in an input field of a 2D video camera, physical objects used as 3D training objects, wherein different positions or a trajectory of each physical object is induced by a defined motion-inducing stimulus;
- determining the physical properties of the 3D training objects from the object trajectory, wherein the physical properties include friction or movement type, and wherein the trajectory is influenced by the shape of the physical object interacting with the environment;
- extracting slowly varying features of different rotational views of the 3D training objects and forming clusters by clustering the extracted features in order to parameterize a shape space representation of the 3D training objects, the shape space being an abstract feature space encoding the 3D training objects' 3D shape properties;
- storing, in a memory, the 3D training objects in a 3D shape space, the shape space being an abstract feature space encoding the 3D training objects' 3D shape properties; and
- in an operation phase, mapping a 2D video signal representation of a 3D training object in the shape space, the coordinates of the 3D training object in relation to the centers of the formed clusters of the clustered extracted features indicating a similarity of the 2D video signal representation of the physical object to the 3D shape or a physical property of the trained 3D training objects, wherein the coordinates include a distance of the representation of the 3D training object in the shape space to the cluster centers.

View Dependent Claims (2)
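A minimal sketch of the training phase described in claim 1, assuming a per-dimension slowness ranking as a simplified stand-in for full slow feature analysis and plain k-means for the clustering step. All data, dimensions, and parameters are illustrative, not from the patent:

```python
import math
import random

def slowness(series):
    # SFA-style objective per dimension: mean squared temporal difference,
    # normalised by variance. Low values = slowly varying.
    mean = sum(series) / len(series)
    var = sum((v - mean) ** 2 for v in series) / len(series)
    if var == 0:
        return float("inf")  # constant dims carry no information
    dsq = sum((series[t + 1] - series[t]) ** 2 for t in range(len(series) - 1))
    return (dsq / (len(series) - 1)) / var

def slow_features(views, k):
    # views: per-frame feature vectors from rotating a training object.
    # Keep the k dimensions that change most slowly across the views.
    dims = len(views[0])
    scores = [(slowness([v[d] for v in views]), d) for d in range(dims)]
    keep = [d for _, d in sorted(scores)[:k]]
    return [[v[d] for d in keep] for v in views], keep

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    # Plain Lloyd's algorithm; the resulting centres parameterise the
    # shape space, and distance to them yields the similarity readout.
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: dist2(p, centers[c]))
            clusters[i].append(p)
        for i, cl in enumerate(clusters):
            if cl:
                centers[i] = [sum(col) / len(cl) for col in zip(*cl)]
    return centers

# Illustrative use: dimension 0 drifts slowly (identity-like signal),
# dimension 1 oscillates quickly (pose-like signal); the slowness
# ranking keeps dimension 0.
views = [[0.01 * t, math.sin(0.9 * t)] for t in range(40)]
_, kept = slow_features(views, k=1)  # kept == [0]
```

Real slow feature analysis solves a generalized eigenvalue problem over (possibly expanded) input signals rather than ranking raw dimensions, but the ranking above captures the same selection criterion in a few lines.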
3. A computing unit for an autonomous robot, comprising:
at least one processor; and
at least one memory including computer program code, wherein the memory and the computer program code are configured to, with the processor, cause the computing unit to:
- in an unsupervised training phase, present, in an input field of a 2D video camera, physical objects used as 3D training objects, wherein different positions or a trajectory of each physical object is induced by a defined motion-inducing stimulus;
- determine the physical properties of the 3D training objects from the object trajectory, wherein the physical properties include friction or movement type, and wherein the trajectory is influenced by the shape of the physical object interacting with the environment;
- extract slowly varying features of different rotational views of the 3D training objects and form clusters by clustering the extracted features in order to parameterize a shape space representation of the 3D training objects, the shape space being an abstract feature space encoding the 3D training objects' 3D shape properties;
- store, in a memory, the 3D training objects in a 3D shape space, the shape space being an abstract feature space encoding the 3D training objects' 3D shape properties; and
- in an operation phase, map a 2D video signal representation of a 3D training object in the shape space, the coordinates of the 3D training object in relation to the centers of the formed clusters of the clustered extracted features indicating a similarity of the 2D video signal representation of the physical object to the 3D shape or a physical property of the trained 3D training objects, wherein the coordinates include a distance of the representation of the 3D training object in the shape space to the cluster centers.

View Dependent Claims (4)
5. A non-transitory computer readable medium for storing instructions, which, when run on a computing device, perform:
- in an unsupervised training phase, presenting, in an input field of a 2D video camera, physical objects used as 3D training objects, wherein different positions or a trajectory of each physical object is induced by a defined motion-inducing stimulus;
- determining the physical properties of the 3D training objects from the object trajectory, wherein the physical properties include friction or movement type, and wherein the trajectory is influenced by the shape of the physical object interacting with the environment;
- extracting slowly varying features of different rotational views of the 3D training objects and forming clusters by clustering the extracted features in order to parameterize a shape space representation of the 3D training objects, the shape space being an abstract feature space encoding the 3D training objects' 3D shape properties;
- storing, in a memory, the 3D training objects in a 3D shape space, the shape space being an abstract feature space encoding the 3D training objects' 3D shape properties; and
- in an operation phase, mapping a 2D video signal representation of a 3D training object in the shape space, the coordinates of the 3D training object in relation to the centers of the formed clusters of the clustered extracted features indicating a similarity of the 2D video signal representation of the physical object to the 3D shape or a physical property of the trained 3D training objects, wherein the coordinates include a distance of the representation of the 3D training object in the shape space to the cluster centers.
Specification