Apparatus and methods for distance estimation using stereo imagery
Abstract
Frame sequences from multiple image sensors may be combined to form, for example, an interleaved frame sequence. Individual frames of the combined sequence may be configured by a combination (e.g., concatenation) of frames from one or more source sequences. The interleaved/concatenated frame sequence may be encoded using a motion estimation encoder. Output of the video encoder may be processed (e.g., parsed) to extract motion information present in the encoded video. The motion information may be used by an adaptive controller to determine the depth of a visual scene, such as from binocular disparity between two or more images, in order to detect one or more objects salient to a given task. In one variant, depth information is utilized during control and operation of mobile robotic devices.
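The abstract does not state how a disparity value is converted into depth. A minimal sketch, assuming a rectified pinhole stereo pair and the standard relation Z = f * B / d, is shown below. The function name and the parameters `focal_px` (focal length in pixels) and `baseline_m` (sensor separation in meters) are illustrative assumptions, not terms from the patent.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Estimate depth (meters) from horizontal disparity (pixels).

    Assumes a rectified pinhole stereo pair, where depth Z = f * B / d.
    All parameter names are hypothetical; the patent does not define them.
    """
    if disparity_px == 0:
        # Zero disparity corresponds to a point at infinity.
        return float("inf")
    return focal_px * baseline_m / abs(disparity_px)


# Example: a block displaced by 8 pixels between the two sensor images,
# with an assumed focal length of 400 px and baseline of 0.5 m.
example_depth = depth_from_disparity(8, 400, 0.5)
```

Larger disparities map to nearer objects, which is why a horizontal motion vector between the two halves of a composite frame can serve as a depth cue.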
193 Citations
23 Claims
1. A non-transitory computer-readable storage medium having computer readable instructions stored thereon, that when executed by at least one processor causes the at least one processor to:
produce a sequence of composite images, each one of a respective sequence of composite images comprising a first image from a first sequence of images and a second image from a second sequence of images, the first image and the second image being joined adjacent to each other; and
evaluate the sequence of composite images to determine a depth parameter of a scene, the evaluating of the sequence of composite images comprising encoding the sequence of composite images into an encoded frame sequence, the encoded frame sequence comprising disparity estimates and motion estimates obtained based on (i) a first set of images from the first sequence of images occurring at one or more time frames and (ii) a second set of images from the second sequence of images occurring at the one or more time frames,
wherein the sequence of composite images includes a first composite image and a second composite image, the first composite image is based on a combination of the second image from the second sequence of images and a plurality of replicas of the first image from the first sequence of images, the second composite image subsequent to the first composite image within the sequence of composite images is based on a combination of a third image from the second sequence of images and a plurality of replicas of a fourth image from the first sequence of images, the third image from the second sequence of images is acquired contemporaneously with the first image from the first sequence of images and subsequent to the second image from the second sequence of images, and the fourth image from the first sequence of images is acquired subsequent to the first image from the first sequence of images.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
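The timing relationships in the wherein clause of claim 1 can be read as follows: composite k joins several replicas of the first-sequence image acquired at time k with the second-sequence image acquired at time k-1. The sketch below illustrates that reading. Images are modeled as lists of pixel rows, and the vertical stacking, the replica count of two, and the function name are assumptions made for illustration only.

```python
def make_composites(first_seq, second_seq, replicas=2):
    """Build composite images in the pattern described by claim 1.

    Composite k stacks `replicas` copies of first_seq[k] on top of
    second_seq[k - 1]. The stacking direction and replica count are
    illustrative assumptions; the claim only requires a plurality of
    replicas joined with an image from the second sequence.
    """
    composites = []
    for k in range(1, len(first_seq)):
        frame = []
        for _ in range(replicas):
            # Replicas of the first-sequence image at time k.
            frame.extend(list(row) for row in first_seq[k])
        # Second-sequence image from the preceding time step.
        frame.extend(list(row) for row in second_seq[k - 1])
        composites.append(frame)
    return composites


# Toy 1x1 "images": values identify the source sequence and time index.
first = [[[1]], [[2]], [[3]]]
second = [[[10]], [[20]], [[30]]]
composites = make_composites(first, second)
```

In this toy run the first composite holds replicas of the first-sequence image at time 1 and the second-sequence image at time 0, and the second composite holds replicas of the image at time 2 and the second-sequence image at time 1, matching the ordering in the claim.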
10. An image processing system, comprising:
an input interface configured to receive a stereo representation of a visual scene, the stereo representation comprising a first portion and a second portion;
a logic component in communication with the input interface and configured to:
arrange the first portion of the stereo representation with the second portion of the stereo representation into a concatenated frame; and
form a sequence of concatenated frames by arranging first portions of the stereo representation and second portions of the stereo representation within a first concatenated frame in an alternate order relative to a preceding concatenated frame within the sequence, the first concatenated frame comprising a different size from either the first portion or the second portion of the stereo representation;
wherein the sequence of concatenated frames further includes a first composite image and a second composite image;
the first composite image of the sequence of concatenated frames is based on a combination of an image from the second sequence of images and a plurality of replicas of an image from the first sequence of images;
the second composite image subsequent to the first composite image within the sequence of concatenated frames, the second composite image is based on a combination of another image from the second sequence of images and a plurality of replicas of another image from the first sequence of images;
the another image from the second sequence of images is acquired contemporaneously with the image from the first sequence of images, and subsequent to the image from the second sequence of images; and
the another image from the first sequence of images is acquired subsequent to the image from the first sequence of images;
a video encoder in data communication with the logic component and configured to encode the sequence of concatenated frames to produce a sequence of compressed frames; and
a processor in data communication with the video encoder and configured to execute computer readable instructions to obtain motion information based on an evaluation of the compressed frames.
View Dependent Claims (11, 12, 13, 14)
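The alternate-order arrangement recited in claim 10 can be sketched as below. Each stereo pair is joined side by side, and the order of the two portions is swapped on every other frame. The side-by-side layout and the helper names `hconcat` and `alternating_frames` are hypothetical; the claim requires only that the order alternates relative to the preceding frame and that the concatenated frame differs in size from either portion.

```python
def hconcat(a, b):
    """Join two equally tall images (lists of pixel rows) side by side."""
    return [list(row_a) + list(row_b) for row_a, row_b in zip(a, b)]


def alternating_frames(stereo_pairs):
    """Concatenate (first, second) portions, swapping order on odd frames.

    Even-indexed frames are laid out as [first | second] and odd-indexed
    frames as [second | first]. This layout is an illustrative assumption.
    """
    frames = []
    for k, (first, second) in enumerate(stereo_pairs):
        if k % 2 == 0:
            frames.append(hconcat(first, second))
        else:
            frames.append(hconcat(second, first))
    return frames


# Two stereo pairs of 1x2 images; letters mark the source portion.
pairs = [
    ([["L0", "L0"]], [["R0", "R0"]]),
    ([["L1", "L1"]], [["R1", "R1"]]),
]
frames = alternating_frames(pairs)
```

With this ordering, a block in a given region of one frame is co-located with a block from the other sensor in the preceding frame, so an encoder's inter-frame search compares images from the two sensors.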
15. An image processing apparatus system, comprising:
computerized logic configured to:
receive a plurality of stereo representations, each of the plurality being representative of a corresponding visual scene and comprising a first portion and a second portion;
combine a first portion of a given first stereo representation at a first time point with a second portion of the given first stereo representation into a first frame in a first relative arrangement;
combine a first portion of another stereo representation at a second time point with a second portion of the other stereo representation into a second frame in a second relative arrangement different from the first relative arrangement; and
form a sequence comprising at least the first and second frames;
produce a sequence of composite images, the sequence of composite images including a first composite image and a second composite image, wherein the first composite image of the sequence of composite images is based on a combination of an image from the second sequence of images and a plurality of replicas of an image from the first sequence of images, the second composite image subsequent to the first composite image within the sequence of composite images, the second composite image is based on a combination of another image from the second sequence of images and a plurality of replicas of another image from the first sequence of images, the another image from the second sequence of images is acquired contemporaneously with the image from the first sequence of images, and subsequent to the image from the second sequence of images, and the another image from the first sequence of images is acquired subsequent to the image from the first sequence of images; and
a video encoder in data communication with the computerized logic and configured to encode the sequence of frames to produce a sequence of encoded frames; and
processing logic in data communication with the video encoder and configured to evaluate the sequence of encoded frames to determine motion information.
16. A method of determining motion information within a visual scene, the method comprising:
producing a first composite frame and a second composite frame by combining images from a first plurality of images and a second plurality of images of the visual scene;
producing an interleaved sequence of composite frames comprising the first and the second composite frames; and
evaluating the interleaved sequence of composite frames to determine a stream of encoded frames comprising the motion information, the motion information comprising (i) information associated with a comparison of an image from the first plurality of images and an image from the second plurality of images, (ii) information associated with a comparison of two images from the first plurality of images, and (iii) information associated with a comparison of two images from the second plurality of images;
wherein individual images of the first and second pluralities of images are provided by first and second sensing apparatus, respectively, the second sensing apparatus being separated spatially from the first sensing apparatus, and
wherein the sequence of composite images includes a first composite image and a second composite image, the first composite image of the sequence of composite images is based on a combination of an image from the second sequence of images and a plurality of replicas of an image from the first sequence of images, the second composite image subsequent to the first composite image within the sequence of composite images, the second composite image is based on a combination of another image from the second sequence of images and a plurality of replicas of another image from the first sequence of images, the another image from the second sequence of images is acquired contemporaneously with the image from the first sequence of images, and subsequent to the image from the second sequence of images, and the another image from the first sequence of images is acquired subsequent to the image from the first sequence of images.
View Dependent Claims (17, 18, 19, 20, 21, 22, 23)
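Claim 16 distinguishes three kinds of image comparison: across sensors, within the first sensor over time, and within the second sensor over time. The patent obtains these from a video encoder's motion estimation. The sketch below is not that encoder; it substitutes a plain one-dimensional block match using the sum of absolute differences (SAD) as a simplified stand-in, to show how one search routine can serve all three comparisons. All names and pixel values are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equal-length pixel runs."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_shift(reference, target, block_start, block_len, max_shift):
    """Find the shift of a reference block that best matches the target row.

    Searches shifts in [-max_shift, max_shift] and returns the one with the
    lowest SAD cost (the first such shift on ties). This is a simplified
    stand-in for an encoder's motion search, not the patent's method.
    """
    block = reference[block_start:block_start + block_len]
    best_cost, best = None, 0
    for shift in range(-max_shift, max_shift + 1):
        start = block_start + shift
        if start < 0 or start + block_len > len(target):
            continue  # Candidate block would fall outside the row.
        cost = sad(block, target[start:start + block_len])
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, shift
    return best


# Single image rows from two sensors at times t and t+1 (toy values).
left_t = [0, 0, 9, 8, 7, 0, 0, 0]
right_t = [0, 0, 0, 0, 9, 8, 7, 0]
left_t1 = [0, 0, 0, 9, 8, 7, 0, 0]
right_t1 = [0, 0, 0, 0, 0, 9, 8, 7]

disparity = best_shift(left_t, right_t, 2, 3, 3)       # (i) across sensors
motion_first = best_shift(left_t, left_t1, 2, 3, 3)    # (ii) first sensor over time
motion_second = best_shift(right_t, right_t1, 4, 3, 3) # (iii) second sensor over time
```

The cross-sensor shift plays the role of a disparity estimate, while the two same-sensor shifts play the role of motion estimates.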
Specification