Methods and apparatus for performing digital image and video segmentation and compression using 3-D depth information
Abstract
Apparatus and methods for identifying one or more separate objects within depth information which corresponds to a field or a frame of video information are disclosed. In a preferred embodiment, an apparatus includes an object map generation circuit for receiving depth information and for converting the depth information into an object map to associate each pixel within the frame of video information with one of one or more regions of varying perceptual importance. This preferred apparatus also includes a region masking circuit for masking the object map to generate one or more depth region masks indicative of pixels within the frame which substantially correspond to preselected regions of depth, and a video object selection circuit for identifying one or more separate objects within each of the one or more preselected regions indicated by each of the one or more region masks, such that each object associated with each depth region is identified as a separate object.
49 Claims
-
1. An apparatus for encoding fields or frames of video information comprising a two dimensional array of pixels, and using a depth component of each of said pixels to enhance encoding, comprising:
-
(a) an encoder for receiving frames or fields of video information and generating a compressed video signal from said received frames or fields of video information, said encoder including a multi-mode quantizer for quantizing data which corresponds to a portion of said fields or frames of video information;
(b) an object segmentation circuit for receiving depth information which corresponds to said received video information and generating an object map to associate each pixel of said received field or frame with one of one or more regions of varying perceptual importance within said received frame or field; and
(c) a rate controller, coupled to said object segmentation circuit and to said multi-mode quantizer, for receiving said object map and for providing a signal, responsive to said object map, to said multi-mode quantizer to select a quantization mode therein, such that said selected quantization mode is reflective of said perceptual importance of said regions indicated by said object map.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
-
-
11. An apparatus for encoding fields or frames of video information comprising a two dimensional array of pixels, and using a depth component of each of said pixels to enhance encoding, comprising:
-
(a) a depth sensing camera capable of generating in real-time both frames or fields of video information and depth information which corresponds to said video information;
(b) an encoder, coupled to said depth sensing camera and receiving said generated frames or fields of video information, for generating a compressed video signal from said frames or fields of video information, said encoder including a multi-mode quantizer for quantizing data which corresponds to a portion of said fields or frames of video information;
(c) an object segmentation circuit, coupled to said depth sensing camera and receiving said generated depth information, for generating an object map to associate each pixel of said received field or frame with one of one or more regions of varying perceptual importance within said received frame or field; and
(d) a rate controller, coupled to said object segmentation circuit and to said multi-mode quantizer, for receiving said object map and for providing a signal, responsive to said object map, to said multi-mode quantizer to select a quantization mode therein, such that said selected quantization mode is reflective of said perceptual importance of said regions indicated by said object map.
-
-
12. A method for encoding fields or frames of video information comprising a two dimensional array of pixels using a depth component of each of said pixels to enhance encoding, comprising the steps of:
-
(a) receiving frames or fields of video information and depth information which corresponds to said received video information;
(b) converting said received depth information into an object map to thereby associate each pixel of said received field or frame with one of one or more regions of varying perceptual importance within said received frame or field;
(c) generating a quantization mode signal based on said object map to select a quantization mode reflective of said perceptual importance of said regions indicated by said object map; and
(d) generating a compressed video signal which corresponds to said received frames or fields of video information by quantizing data which corresponds to a portion of said received fields or frames of video information in accordance with said quantization mode selected by said quantization mode signal.
- View Dependent Claims (13, 14, 15, 16, 17, 18, 19, 20, 21)
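Steps (c) and (d) of claim 12 can be illustrated with a minimal sketch. It is not the patented implementation: the region labels, the `STEP_FOR_REGION` table, the tile size, and the rule that a tile takes the finest step of any region it touches are all assumptions made for illustration. Label 0 is assumed to mark the most perceptually important region.

```python
# Hypothetical mapping from object-map region label to quantizer step size.
# Assumption: label 0 is the most perceptually important region.
STEP_FOR_REGION = {0: 2, 1: 8, 2: 16}


def select_block_steps(object_map, block):
    """Return one quantizer step per block x block tile of the object map."""
    rows, cols = len(object_map), len(object_map[0])
    steps = []
    for by in range(0, rows, block):
        row_steps = []
        for bx in range(0, cols, block):
            # Collect every region label present in this tile.
            labels = {
                object_map[y][x]
                for y in range(by, min(by + block, rows))
                for x in range(bx, min(bx + block, cols))
            }
            # Assumed rule: a tile uses the finest step of any region in it.
            row_steps.append(min(STEP_FOR_REGION[label] for label in labels))
        steps.append(row_steps)
    return steps


def quantize(coeffs, step):
    """Uniform quantization: divide each coefficient by step and round."""
    return [int(round(c / step)) for c in coeffs]
```

In this sketch the list returned by `select_block_steps` plays the role of the quantization mode signal, and `quantize` stands in for the multi-mode quantizer operating on one tile's data.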
-
-
22. An apparatus for performing object-based encoding of video information using a depth component of said video information to enhance encoding, comprising:
-
(a) an object segmentation circuit for receiving depth information for a frame of video information and generating one or more object identification signals based on said received depth information indicative of a shape of one or more objects within said frame of video information; and
(b) an encoder, coupled to said object segmentation circuit, for receiving the frame of video information which corresponds to said received depth information and said one or more object identification signals, and for encoding a video signal representing only a portion of said video information, which portion substantially corresponds to said one or more objects identified by said one or more object identification signals.
- View Dependent Claims (23, 24, 25, 26, 27, 28, 29, 30, 31)
-
-
32. An apparatus for performing object-based encoding of video information using a depth component of said video information to enhance encoding, comprising:
-
(a) a depth sensing camera capable of generating in real-time both frames of video information and depth information which corresponds to said video information;
(b) an object segmentation circuit, coupled to said depth sensing camera and receiving said generated depth information, for generating one or more object identification signals based on said received depth information indicative of a shape of one or more objects within said frame of video information; and
(c) an encoder, coupled to said object segmentation circuit and to said depth sensing camera and receiving said generated frame of video information which corresponds to said received depth information and said one or more object identification signals, for encoding a video signal representing only a portion of said video information, which portion substantially corresponds to said one or more objects identified by said one or more object identification signals.
-
-
33. A method for performing object-based encoding of video information using a depth component of said video information to enhance encoding, comprising the steps of:
-
(a) receiving frames of video information and depth information which corresponds to said received video information;
(b) generating one or more object identification signals based on said received depth information indicative of a shape of one or more objects within said frame of video information; and
(c) encoding a video signal representing only a portion of said received video information, which portion substantially corresponds to said one or more objects identified by said one or more object identification signals.
- View Dependent Claims (34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44)
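Step (c) of claim 33 encodes only the portion of the frame that corresponds to an identified object. The sketch below shows one way the portion could be isolated before encoding, assuming the object identification signal is available as a binary shape mask with the same dimensions as the frame. The function name `extract_object`, the bounding-box crop, and the `fill` value are hypothetical choices, not taken from the patent.

```python
def extract_object(frame, mask, fill=0):
    """Crop frame to the mask's bounding box, blanking pixels outside the object.

    frame and mask are equal-sized lists of rows; mask entries are 0 or 1.
    Returns ((top, left), crop), or None if the mask is empty.
    """
    coords = [
        (y, x)
        for y, row in enumerate(mask)
        for x, inside in enumerate(row)
        if inside
    ]
    if not coords:
        return None  # no object pixels, so nothing to encode
    top = min(y for y, _ in coords)
    bottom = max(y for y, _ in coords)
    left = min(x for _, x in coords)
    right = max(x for _, x in coords)
    crop = [
        [frame[y][x] if mask[y][x] else fill for x in range(left, right + 1)]
        for y in range(top, bottom + 1)
    ]
    return (top, left), crop
```

A downstream encoder would then receive only `crop` together with its origin and the shape mask, rather than the full frame.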
-
-
45. An object segmentation circuit for receiving depth information which corresponds to a frame of video information, and for identifying one or more separate objects within said frame of video information, comprising:
-
(a) an object map generation circuit for receiving said depth information, and for converting said depth information into an object map to thereby associate each pixel within said frame of video information with one of one or more regions of varying perceptual importance within said frame, wherein said object map generation circuit comprises:
(1) a histogram generation circuit for receiving said depth information and for computing a histogram of said depth information to thereby provide the number of pixels which have a predetermined depth value for a range of predetermined values;
(2) a first logic circuit, coupled to said histogram generation circuit, for receiving said generated histogram and for setting all values in said histogram which are below a predetermined threshold value to zero to thereby generate a clipped histogram;
(3) a second logic circuit, coupled to said first logic circuit, for receiving said clipped histogram and for scanning said clipped histogram to find boundaries of n regions with n different threshold depth values; and
(4) a variable step quantization circuit, coupled to said second logic circuit, for receiving said n different threshold values and said depth information, and for quantizing said depth information based on said n different threshold values, to thereby generate said object map;
(b) a region masking circuit, coupled to said object map generation circuit and receiving said object map, for masking said object map to generate one or more depth region masks indicative of pixels within said frame which substantially correspond to preselected regions of depth; and
(c) a video object selection circuit, coupled to said region masking circuit and receiving said generated one or more region masks, for identifying one or more separate objects within each of said one or more preselected regions indicated by each of said one or more region masks, such that each object associated with each depth region is identified as a separate object.
- View Dependent Claims (46)
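The four stages of the object map generation circuit in claim 45(a) can be sketched in software as follows. This is an illustrative reading of the claim, not the disclosed circuit. It assumes integer depth values in the range 0 to `levels - 1`, treats each run of non-zero bins in the clipped histogram as one region, and uses the last depth value of each run as that region's threshold. Depths that fall in clipped (zeroed) bins are assigned to the next region up, which is also an assumption.

```python
from bisect import bisect_left


def depth_histogram(depth, levels=256):
    """Count how many pixels have each depth value in 0..levels-1."""
    hist = [0] * levels
    for row in depth:
        for d in row:
            hist[d] += 1
    return hist


def clip_histogram(hist, min_count):
    """Zero every bin whose count is below min_count."""
    return [h if h >= min_count else 0 for h in hist]


def find_thresholds(clipped):
    """Scan for runs of non-zero bins; return the last depth value of each run."""
    thresholds = []
    in_run = False
    for d, count in enumerate(clipped):
        if count and not in_run:
            in_run = True
        elif not count and in_run:
            thresholds.append(d - 1)
            in_run = False
    if in_run:
        thresholds.append(len(clipped) - 1)
    return thresholds


def quantize_depth(depth, thresholds):
    """Variable-step quantization: label each pixel with its region index."""
    if not thresholds:
        return [[0 for _ in row] for row in depth]
    last = len(thresholds) - 1
    # bisect_left gives the index of the first threshold >= d; clamp to last.
    return [[min(bisect_left(thresholds, d), last) for d in row] for row in depth]
```

Chaining the four functions on a depth frame yields an object map whose labels index the n regions found by the scan.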
-
-
47. A method for identifying one or more separate objects within depth information which corresponds to a frame of video information, comprising the steps of:
-
(a) receiving said depth information;
(b) converting said received depth information into an object map to thereby associate each pixel within said frame of video information with one of one or more regions of varying perceptual importance within said frame, wherein said converting step comprises the steps of:
(1) computing a histogram of said received depth information to thereby provide the number of pixels which have a predetermined depth value for a range of predetermined values;
(2) setting all values in said histogram which are below a predetermined threshold value to zero to thereby generate a clipped histogram;
(3) scanning said clipped histogram to find boundaries of n regions with n different threshold depth values; and
(4) quantizing said depth information based on said n different threshold values;
(c) masking said object map to generate one or more depth region masks indicative of pixels within said frame which substantially correspond to preselected regions of depth; and
(d) identifying one or more separate objects within each of said one or more preselected regions indicated by said one or more region masks, such that each object associated with each depth region is identified as a separate object.
- View Dependent Claims (48, 49)
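Steps (c) and (d) of claim 47 can be sketched as a masking pass followed by a labeling pass. The claim does not say how separate objects within a depth region are told apart. The sketch below assumes 4-connected component labeling by breadth-first search, a common technique swapped in for illustration, and the function names are hypothetical.

```python
from collections import deque


def region_mask(object_map, label):
    """Binary mask of the pixels whose object-map entry equals label."""
    return [[1 if v == label else 0 for v in row] for row in object_map]


def label_objects(mask):
    """Label 4-connected groups of mask pixels; return (label image, count)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for y in range(rows):
        for x in range(cols):
            if mask[y][x] and not labels[y][x]:
                # Unvisited object pixel: start a new object and flood it.
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

Under this assumption, two people standing at the same distance but apart in the frame share one depth region mask yet receive different object labels.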
-
Specification