Methods and apparatus for performing digital image and video segmentation and compression using 3-D depth information

US 6,055,330 A
Filed: 10/09/1996
Issued: 04/25/2000
Est. Priority Date: 10/09/1996
Status: Expired due to Fees

Abstract

Apparatus and methods for identifying one or more separate objects within depth information which corresponds to a field or a frame of video information are disclosed. In a preferred embodiment, an apparatus includes an object map generation circuit for receiving depth information and for converting the depth information into an object map that associates each pixel within the frame of video information with one of one or more regions of varying perceptual importance. This preferred apparatus also includes a region masking circuit for masking the object map to generate one or more depth region masks indicative of pixels within the frame which substantially correspond to preselected regions of depth, and a video object selection circuit for identifying one or more separate objects within each of the one or more preselected regions indicated by each of the one or more region masks, such that each object associated with each depth region is identified as a separate object.

49 Claims

1. An apparatus for encoding fields or frames of video information comprising a two dimensional array of pixels, and using a depth component of each of said pixels to enhance encoding, comprising:
- (a) an encoder for receiving frames or fields of video information and generating a compressed video signal from said received frames or fields of video information, said encoder including a multi-mode quantizer for quantizing data which corresponds to a portion of said fields or frames of video information;
  
  (b) an object segmentation circuit for receiving depth information which corresponds to said received video information and generating an object map to associate each pixel of said received field or frame with one of one or more regions of varying perceptual importance within said received frame or field; and
  
  (c) a rate controller, coupled to said object segmentation circuit and to said multi-mode quantizer, for receiving said object map and for providing a signal, responsive to said object map, to said multi-mode quantizer to select a quantization mode therein, such that said selected quantization mode is reflective of said perceptual importance of said regions indicated by said object map.
- - 2. The apparatus of claim 1, wherein said encoder is a variable bit rate encoder capable of generating an MPEG-2 compliant bit stream.
  - 3. The apparatus of claim 1, wherein said object segmentation circuit comprises:
    - (1) a histogram generation circuit for receiving said depth information and for computing a histogram of said depth information to thereby provide the number of pixels which have a predetermined depth value for a range of predetermined values;
      
      (2) a first logic circuit, coupled to said histogram generation circuit, for receiving said generated histogram and for setting all values in said histogram which are below a predetermined threshold value to zero to thereby generate a clipped histogram;
      
      (3) a second logic circuit, coupled to said first logic circuit, for receiving said clipped histogram and for scanning said clipped histogram to find boundaries of n regions with n different threshold depth values; and
      
      (4) a variable step quantization circuit, coupled to said second logic circuit, for receiving said n different threshold values and said depth information, and for quantizing said depth information based on said n different threshold values, to thereby generate said object map.
  - 4. The apparatus of claim 3, wherein said histogram generation circuit comprises:
    - (i) a buffer for receiving and temporarily storing said depth information;
      
      (ii) a memory, coupled to said buffer, for receiving said depth information from said buffer as memory addresses and for storing histogram values as said memory addresses; and
      
      (iii) a logic circuit, coupled to said memory, for reading a histogram value from said memory at an address location, updating said histogram value, and providing said updated histogram value to said memory at said address location.
  - 5. The apparatus of claim 1, wherein said encoder is a variable bit rate encoder, and further comprising:
    - (d) a video buffer having a preselected storage capacity, coupled to said encoder and to said rate controller, for receiving and temporarily storing said generated compressed video signal, and for providing a signal indicative of an overflow condition to said rate controller, wherein said signal provided by said rate controller to said multi-mode quantizer is also responsive to said overflow signal such that said selected quantization mode is reflective of said perceptual importance of said regions as constrained by said storage capacity of said video buffer.
  - 6. The apparatus of claim 5, further comprising:
    - (e) a macroblock labeling circuit, coupled to said object segmentation circuit and to said rate controller, for assigning a current macroblock of video data to one of said regions of varying perceptual importance and for providing a signal indicative of said assigned region to said rate controller, wherein said signal provided by said rate controller to said multi-mode quantizer is reflective of said assigned region.
  - 7. The apparatus of claim 6, wherein said current macroblock of video data is assigned to a region having the greatest perceptual importance of one or more regions identified by a location of said object map which corresponds to said macroblock of video data.
  - 8. The apparatus of claim 6, wherein said encoder generates an output signal when compressed video data is output to said video buffer, and wherein said macroblock labeling circuit generates a signal indicative of a target bit rate associated with said assigned region, further comprising:
    - (f) a clock signal generating circuit, coupled to said video buffer, for providing a clock signal to said buffer, wherein said video buffer outputs a predetermined amount of said compressed video signal in response to said clock signal; and
      
      (g) a counter, coupled to said encoder, to said clock signal generating circuit, to said macroblock labeling circuit and to said rate controller, for receiving said clock signal, said target bit rate signal and said encoder output signal, for counting the number of bits that are in a virtual buffer associated with said video buffer by adding to said count in response to said encoder output signal and subtracting from said count in response to said clock signal and to said target bit rate signal to thereby determine an occupancy of said virtual buffer, and for providing a virtual buffer occupancy signal indicative of said count to said rate controller;
      
      wherein said signal provided by said rate controller to said multi-mode quantizer is also responsive to said virtual buffer occupancy signal such that said selected quantization mode is reflective of said perceptual importance of said regions indicated by said virtual buffer capacity as constrained by said storage capacity of said video buffer.
  - 9. The apparatus of claim 8, wherein said virtual buffer occupancy $B_i$ is determined by the equation $B_i = B_{i-1} + b_i - r\,(R_k / R)$, where $b_i$ is equal to the number of bits used to encode the present macroblock indicated by said encoder output signal, $r$ is equal to the number of bits output by said video buffer indicated by said clock signal, $R_k$ is the target bit rate indicated by said target bit rate signal, and $R$ is the average output bit rate to be maintained by said video buffer.
  - 10. The apparatus of claim 8, wherein said rate controller further comprises a buffer size logic circuit, coupled to said macroblock labeling circuit, for receiving said assigned region and for generating a buffer size modulation signal whenever said assigned region is different from an immediately preceding assigned region, wherein said signal provided by said rate controller to said multi-mode quantizer is also responsive to said buffer size modulation signal such that said selected quantization mode is reflective of said perceptual importance of said regions indicated by said virtual buffer capacity as modulated by said buffer size modulation signal and as constrained by said storage capacity of said video buffer.
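
For illustration only, the four stages recited in claim 3 (histogram, clipping, boundary scan, variable step quantization) might be sketched in software as follows. This is a minimal sketch and not the patented circuit: the function name `depth_object_map`, the parameters `num_levels` and `clip_threshold`, and the choice to treat each run of non-empty histogram bins as one region are all assumptions made for the example.

```python
def depth_object_map(depth, num_levels=256, clip_threshold=2):
    """Turn a 2-D list of integer depth values into a region-label map.

    All names and defaults here are hypothetical; depth values are
    assumed to lie in the range [0, num_levels).
    """
    # (1) Histogram: count how many pixels have each depth value.
    hist = [0] * num_levels
    for row in depth:
        for d in row:
            hist[d] += 1

    # (2) Clipping: bins below the threshold are set to zero.
    clipped = [h if h >= clip_threshold else 0 for h in hist]

    # (3) Boundary scan: each run of non-empty bins is treated as one
    # region (an assumption); record the exclusive upper depth of the run.
    thresholds = []
    in_run = False
    for d, h in enumerate(clipped):
        if h > 0:
            in_run = True
        elif in_run:
            thresholds.append(d)
            in_run = False
    if in_run:
        thresholds.append(num_levels)

    # (4) Variable step quantization: a pixel gets the index of the first
    # threshold above its depth; deeper outliers fall into the last region.
    def label(d):
        for k, t in enumerate(thresholds):
            if d < t:
                return k
        return max(len(thresholds) - 1, 0)

    return [[label(d) for d in row] for row in depth], thresholds


depth = [[10, 10, 11, 200],
         [10, 11, 200, 200],
         [10, 10, 90, 201]]
object_map, thresholds = depth_object_map(depth)
print(thresholds)   # [12, 201]
print(object_map)   # [[0, 0, 0, 1], [0, 0, 1, 1], [0, 0, 1, 1]]
```

In this sketch, sparsely populated depths (90 and 201 above) are removed from the histogram by clipping, yet the corresponding pixels are still labeled, because the final quantization step assigns every depth value to some region.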

11. An apparatus for encoding fields or frames of video information comprising a two dimensional array of pixels, and using a depth component of each of said pixels to enhance encoding, comprising:
- (a) a depth sensing camera capable of generating in real-time both frames or fields of video information and depth information which corresponds to said video information;
  
  (b) an encoder, coupled to said depth sensing camera and receiving said generated frames or fields of video information, for generating a compressed video signal from said frames or fields of video information, said encoder including a multi-mode quantizer for quantizing data which corresponds to a portion of said fields or frames of video information;
  
  (c) an object segmentation circuit, coupled to said depth sensing camera and receiving said generated depth information, for generating an object map to associate each pixel of said received field or frame with one of one or more regions of varying perceptual importance within said received frame or field; and
  
  (d) a rate controller, coupled to said object segmentation circuit and to said multi-mode quantizer, for receiving said object map and for providing a signal, responsive to said object map, to said multi-mode quantizer to select a quantization mode therein, such that said selected quantization mode is reflective of said perceptual importance of said regions indicated by said object map.
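
As a hedged illustration of how an object map could steer a multi-mode quantizer, the sketch below labels each macroblock with the most important region found inside it (in the spirit of claim 7) and then looks up a quantizer scale for that label. The convention that a lower label means greater perceptual importance, the `QUANT_SCALE` table, and the function names are assumptions for this example; the scale values merely resemble the 1 to 31 range of an MPEG-2 quantiser scale code.

```python
# Hypothetical mapping from region label to quantizer scale:
# a smaller scale means finer quantization (more bits spent).
QUANT_SCALE = {0: 4, 1: 12, 2: 24}
COARSEST_SCALE = 31


def label_macroblocks(object_map, mb_size=16):
    """Assign each macroblock the most important region it overlaps.

    Assumes a lower label denotes greater perceptual importance.
    """
    height = len(object_map)
    width = len(object_map[0])
    labels = []
    for top in range(0, height, mb_size):
        row_labels = []
        for left in range(0, width, mb_size):
            block = [object_map[y][x]
                     for y in range(top, min(top + mb_size, height))
                     for x in range(left, min(left + mb_size, width))]
            row_labels.append(min(block))
        labels.append(row_labels)
    return labels


def select_quant_scale(region):
    """Pick a quantization mode (here, a scale) for a region label."""
    return QUANT_SCALE.get(region, COARSEST_SCALE)


object_map = [[0, 0, 1, 1],
              [0, 1, 1, 1],
              [2, 2, 1, 2],
              [2, 2, 2, 2]]
mb_labels = label_macroblocks(object_map, mb_size=2)
print(mb_labels)                                                   # [[0, 1], [2, 1]]
print([[select_quant_scale(r) for r in row] for row in mb_labels])  # [[4, 12], [24, 12]]
```

A real encoder would combine this per-region choice with buffer feedback, as the dependent claims describe; the lookup table here only shows the region-to-mode association.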

12. A method for encoding fields or frames of video information comprising a two dimensional array of pixels using a depth component of each of said pixels to enhance encoding, comprising the steps of:
- (a) receiving frames or fields of video information and depth information which corresponds to said received video information;
  
  (b) converting said received depth information into an object map to thereby associate each pixel of said received field or frame with one of one or more regions of varying perceptual importance within said received frame or field;
  
  (c) generating a quantization mode signal based on said object map to select a quantization mode reflective of said perceptual importance of said regions indicated by said object map; and
  
  (d) generating a compressed video signal which corresponds to said received frames or fields of video information by quantizing data which corresponds to a portion of said received fields or frames of video information in accordance with said quantization mode selected by said quantization mode signal.
- - 13. The method of claim 12, wherein said compressed video signal generated in step (d) is an MPEG-2 compliant bit stream.
  - 14. The method of claim 12, wherein said converting step comprises:
    - (1) computing a histogram of said received depth information to thereby provide the number of pixels which have a predetermined depth value for a range of predetermined values;
      
      (2) setting all values in said histogram which are below a predetermined threshold value to zero to thereby generate a clipped histogram;
      
      (3) scanning said clipped histogram to find boundaries of n regions with n different threshold depth values; and
      
      (4) quantizing said depth information based on said n different threshold values.
  - 15. The method of claim 12, wherein said generated compressed video signal is a variable bit rate signal, and further comprising the steps of:
    - (e) buffering said generated compressed video signal; and
      
      (f) providing a signal indicative of a buffering overflow condition, wherein said quantization mode signal is also responsive to said overflow signal such that said selected quantization mode is reflective of said perceptual importance of said regions as constrained by a buffering limitation.
  - 16. The method of claim 15, further comprising the step of assigning, prior to generating said quantization mode signal, a current macroblock of video data within said received frames or fields of video information to one of said regions of varying perceptual importance, wherein said quantization mode signal is reflective of said assigned region.
  - 17. The method of claim 16, wherein said assigning step further comprises assigning said current macroblock of video data to a region having the greatest perceptual importance of one or more regions identified by a location of said object map which corresponds to said macroblock of video data.
  - 18. The method of claim 16, further comprising the step of generating a signal indicative of a target bit rate associated with said assigned region prior to generating said quantization mode signal.
  - 19. The method of claim 18, further comprising the steps of:
    - (i) adding to a virtual buffer count indicative of a virtual buffer occupancy whenever compressed video signal information is buffered;
      
      (ii) subtracting from said count whenever buffered compressed video signal information is output in an amount which is dependent on said target bit rate signal; and
      
      (iii) generating a virtual buffer occupancy signal indicative of said count, wherein said quantization mode signal is also responsive to said virtual buffer occupancy signal such that said selected quantization mode is reflective of said perceptual importance of said regions indicated by said virtual buffer occupancy as constrained by said buffering limitation.
  - 20. The method of claim 19, wherein said virtual buffer occupancy $B_i$ is determined by the equation $B_i = B_{i-1} + b_i - r\,(R_k / R)$, where $b_i$ is equal to the number of bits used to encode the present macroblock indicated by said encoder output signal, $r$ is equal to the number of bits output by said video buffer indicated by said clock signal, $R_k$ is the target bit rate indicated by said target bit rate signal, and $R$ is the average output bit rate required to prevent said buffering limitation from occurring.
  - 21. The method of claim 16, further comprising the step of generating, prior to generating said quantization mode signal, a buffer size modulation signal whenever said assigned region is different from an immediately preceding assigned region, wherein said quantization mode signal is also responsive to said buffer size modulation signal such that said selected quantization mode is reflective of said perceptual importance of said regions indicated by said virtual buffer capacity as modulated by said buffer size modulation signal and as constrained by said buffering limitation.
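
The virtual buffer update of claims 19 and 20 can be worked through numerically. The sketch below applies the recited recurrence, new occupancy = previous occupancy + bits encoded - bits drained × (target rate / average rate), once per macroblock. The function name `virtual_buffer_trace`, its parameter names, and the sample numbers are assumptions chosen for the example.

```python
def virtual_buffer_trace(macroblocks, drain_bits, average_rate, initial=0.0):
    """Apply B_i = B_(i-1) + b_i - r * (R_k / R) for each macroblock.

    macroblocks:  iterable of (bits_encoded b_i, target_rate R_k) pairs
    drain_bits:   r, bits output by the video buffer per update
    average_rate: R, the average output bit rate
    initial:      starting occupancy, B_0
    """
    occupancy = initial
    trace = []
    for bits_encoded, target_rate in macroblocks:
        occupancy = occupancy + bits_encoded - drain_bits * (target_rate / average_rate)
        trace.append(occupancy)
    return trace


# Two macroblocks of 400 bits each: the first in a region with a target
# rate of half the average, the second in a region with twice the average.
trace = virtual_buffer_trace(
    macroblocks=[(400, 2_000_000), (400, 8_000_000)],
    drain_bits=300,
    average_rate=4_000_000,
    initial=1000.0,
)
print(trace)  # [1250.0, 1050.0]
```

In this example the region with the higher target bit rate drains the virtual buffer faster (600 bits versus 150), so its occupancy falls even though the same number of bits was produced. Under an occupancy-driven rate control, which is one plausible reading of the claims, that lower occupancy would allow finer quantization for the more important region.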

22. An apparatus for performing object-based encoding of video information using a depth component of said video information to enhance encoding, comprising:
- (a) an object segmentation circuit for receiving depth information for a frame of video information and generating one or more object identification signals based on said received depth information indicative of a shape of one or more objects within said frame of video information; and
  
  (b) an encoder, coupled to said object segmentation circuit, for receiving the frame of video information which corresponds to said received depth information and said one or more object identification signals, and for encoding a video signal representing only a portion of said video information, which portion substantially corresponds to said one or more objects identified by said one or more object identification signals.
- - 23. The apparatus of claim 22, wherein said object segmentation circuit comprises:
    - (1) an object map generation circuit for receiving said depth information and for converting said depth information into an object map to thereby associate each pixel within said frame of video information with one of one or more regions of varying perceptual importance within said frame;
      
      (2) a region masking circuit, coupled to said object map generation circuit and receiving said object map, for masking said object map to generate a depth region mask indicative of pixels within said frame which substantially correspond to a preselected region; and
      
      (3) a video object selection circuit, coupled to said region masking circuit and receiving said generated region mask, for identifying one or more separate objects within said preselected region indicated by said region mask and generating said one or more object identification signals, such that each one of said object identification signals identifies one of said one or more identified separate objects.
  - 24. The apparatus of claim 23, wherein said object map generation circuit comprises:
    - (i) a histogram generation circuit for receiving said depth information and for computing a histogram of said depth information to thereby provide the number of pixels which have a predetermined depth value for a range of predetermined values;
      
      (ii) a first logic circuit, coupled to said histogram generation circuit, for receiving said generated histogram and for setting all values in said histogram which are below a predetermined threshold value to zero to thereby generate a clipped histogram;
      
      (iii) a second logic circuit, coupled to said first logic circuit, for receiving said clipped histogram and for scanning said clipped histogram to find boundaries of n regions with n different threshold depth values; and
      
      (iv) a variable step quantization circuit, coupled to said second logic circuit, for receiving said n different threshold values and said depth information, and for quantizing said depth information based on said n different threshold values, to thereby generate said object map.
  - 25. The apparatus of claim 24, wherein said histogram generation circuit comprises:
    - (A) a buffer for receiving and temporarily storing said depth information;
      
      (B) a memory, coupled to said buffer, for receiving said depth information from said buffer as memory addresses and for storing histogram values as said memory addresses; and
      
      (C) a logic circuit, coupled to said memory, for reading a histogram value from said memory at an address location, updating said histogram value, and providing said updated histogram value to said memory at said address location.
  - 26. The apparatus of claim 22, wherein said encoder includes a multi-mode quantizer for quantizing data which corresponds to a portion of said received fields of video information, further comprising:
    - (c) a rate controller, coupled to said object segmentation circuit and to said multi-mode quantizer, for receiving said one or more object identification signals and for providing a signal, responsive to said object identification signals, to said multi-mode quantizer to select a quantization mode therein, such that for each portion of data to be quantized, said selected quantization mode is reflective of a perceptual importance of an object identified by one of said one or more object identification signals which is associated with said portion of data to be quantized.
  - 27. The apparatus of claim 26, wherein said encoder is a variable bit rate encoder, and further comprising:
    - (d) a video buffer having a preselected storage capacity, coupled to said encoder and to said rate controller, for receiving and temporarily storing said generated compressed video signal, and for providing a signal indicative of an overflow condition to said rate controller, wherein said signal provided by said rate controller to said multi-mode quantizer is also responsive to said overflow signal such that said selected quantization mode is reflective of said perceptual importance of said object as constrained by said storage capacity of said video buffer.
  - 28. The apparatus of claim 27, wherein said portion of data to be quantized is a macroblock, further comprising:
    - (e) a macroblock labeling circuit, coupled to said object segmentation circuit and to said rate controller, for assigning a current macroblock of video data to one of said objects and for providing a signal indicative of said assigned object to said rate controller, wherein said signal provided by said rate controller to said multi-mode quantizer is reflective of said assigned object.
  - 29. The apparatus of claim 28, wherein said encoder generates an output signal when compressed video data is output to said video buffer, and wherein said macroblock labeling circuit generates a signal indicative of a target bit rate associated with said assigned object, further comprising:
    - (f) a clock signal generating circuit, coupled to said video buffer and providing a clock signal to said buffer, wherein said video buffer outputs a predetermined amount of said compressed video signal in response to said clock signal; and
      
      (g) a counter, coupled to said encoder, to said clock signal generating circuit, to said macroblock labeling circuit and to said rate controller, for receiving said clock signal, said target bit rate signal and said encoder output signal, for counting the number of bits that are in a virtual buffer associated with said video buffer by adding to said count in response to said encoder output signal and subtracting from said count in response to said clock signal and to said target bit rate signal to thereby determine an occupancy of said virtual buffer, and for providing a virtual buffer occupancy signal indicative of said count to said rate controller;
      
      wherein said signal provided by said rate controller to said multi-mode quantizer is also responsive to said virtual buffer occupancy signal such that said selected quantization mode is reflective of said perceptual importance of said object indicated by said virtual buffer capacity as constrained by said storage capacity of said video buffer.
  - 30. The apparatus of claim 29, wherein said virtual buffer occupancy $B_i$ is determined by the equation $B_i = B_{i-1} + b_i - r\,(R_k / R)$, where $b_i$ is equal to the number of bits used to encode the present macroblock indicated by said encoder output signal, $r$ is equal to the number of bits output by said video buffer indicated by said clock signal, $R_k$ is the target bit rate indicated by said target bit rate signal, and $R$ is the average output bit rate to be maintained by said video buffer.
  - 31. The apparatus of claim 29, wherein said rate controller further comprises a buffer size logic circuit, coupled to said macroblock labeling circuit, for receiving said assigned object and for generating a buffer size modulation signal whenever said assigned object is different from an immediately preceding assigned object, wherein said signal provided by said rate controller to said multi-mode quantizer is also responsive to said buffer size modulation signal such that said selected quantization mode is reflective of said perceptual importance of said object indicated by said virtual buffer capacity as modulated by said buffer size modulation signal and as constrained by said storage capacity of said video buffer.
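
The region masking of claim 23 and the encoding of "only a portion" of the video in claim 22 can be pictured with a small sketch. Here a mask is derived from the object map for one preselected region and then used to blank every pixel outside it before the frame would reach an encoder. The function names, the binary 1/0 mask convention, and the `fill` value are assumptions for illustration; the claims do not specify how the excluded pixels are represented.

```python
def region_mask(object_map, region):
    """Return a mask with 1 where the object map equals the chosen region."""
    return [[1 if label == region else 0 for label in row] for row in object_map]


def apply_mask(frame, mask, fill=0):
    """Keep pixels under the mask; replace all others with a fill value."""
    return [[pixel if keep else fill for pixel, keep in zip(frame_row, mask_row)]
            for frame_row, mask_row in zip(frame, mask)]


object_map = [[0, 1],
              [1, 2]]
frame = [[10, 20],
         [30, 40]]

mask = region_mask(object_map, region=1)
print(mask)                     # [[0, 1], [1, 0]]
print(apply_mask(frame, mask))  # [[0, 20], [30, 0]]
```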

32. An apparatus for performing object-based encoding of video information using a depth component of said video information to enhance encoding, comprising:
- (a) a depth sensing camera capable of generating in real-time both frames of video information and depth information which corresponds to said video information;
  
  (b) an object segmentation circuit, coupled to said depth sensing camera and receiving said generated depth information, for generating one or more object identification signals based on said received depth information indicative of a shape of one or more objects within said frame of video information; and
  
  (c) an encoder, coupled to said object segmentation circuit and to said depth sensing camera and receiving said generated frame of video information which corresponds to said received depth information and said one or more object identification signals, for encoding a video signal representing only a portion of said video information, which portion substantially corresponds to said one or more objects identified by said one or more object identification signals.

33. A method for performing object-based encoding of video information using a depth component of said video information to enhance encoding, comprising the steps of:
- (a) receiving frames of video information and depth information which corresponds to said received video information;
  
  (b) generating one or more object identification signals based on said received depth information indicative of a shape of one or more objects within said frame of video information; and
  
  (c) encoding a video signal representing only a portion of said received video information, which portion substantially corresponds to said one or more objects identified by said one or more object identification signals.
- - 34. The method of claim 33, wherein step (b) comprises the steps of:
    - (1) converting said received depth information into an object map to thereby associate each pixel within said frame of video information with one of one or more regions of varying perceptual importance within said frame;
      
      (2) masking said object map to generate a depth region mask indicative of pixels within said frame which substantially correspond to a preselected region;
      
      (3) identifying one or more separate objects within said preselected region indicated by said depth region mask; and
      
      (4) generating said one or more object identification signals such that each one of said object identification signals identifies one of said one or more identified separate objects.
  - 35. The method of claim 34, wherein said converting step comprises the steps of:
    - (i) computing a histogram of said received depth information to thereby provide the number of pixels which have a predetermined depth value for a range of predetermined values;
      
      (ii) setting all values in said histogram which are below a predetermined threshold value to zero to thereby generate a clipped histogram;
      
      (iii) scanning said clipped histogram to find boundaries of n regions with n different threshold depth values; and
      
      (iv) quantizing said depth information based on said n different threshold values.
  - 36. The method of claim 34, wherein said identifying step comprises the steps of:
    - (i) scanning said depth region mask until a pixel with a nonmasked value is found;
      
      (ii) searching neighboring pixels within said mask to find any other neighboring pixels with nonmasked values;
      
      (iii) repeating said searching step until no neighboring pixels have a nonmasked value to identify all neighboring found pixels with nonmasked values as a video object plane which corresponds to an object within said depth region;
      
      (iv) masking said object from said depth region mask; and
      
      (v) repeating steps (i)-(iv) until all pixels within said depth region mask are masked to thereby identify one or more video object planes within said received frame of video information.
  - 37. The method of claim 36, wherein said identifying step further comprises the steps of:
    - (vi) selecting one of said one or more video object planes, and one of one or more video object planes associated with an immediately preceding frame of video information;
      
      (vii) comparing said selected video object plane and said selected previous frame video object plane to determine a depth difference therebetween;
      
      (viii) repeating step (vii) after selecting a different one of said one or more previous frame video object planes unless all of said one or more previous frame video object planes have been selected;
      
      (ix) assigning said selected video object plane to a video object which corresponds to one of said one or more previous frame video object planes for which a depth difference therebetween is minimized as compared to all of said determined depth differences; and
      
      (x) repeating steps (vii)-(ix) after selecting a different one of said one or more video object planes unless all of said one or more video object planes have been selected, so that each of said one or more video object planes identifies an object.
  - 38. The method of claim 33, wherein step (c) includes quantizing data which corresponds to a portion of said received fields of video information, and further comprising the step of (d) generating a quantization mode signal based on an object identification signal which corresponds to said data to select a quantization mode reflective of a perceptual importance of an object indicated by said object identification signal, such that said data is quantized based on said selected quantization mode.
  - 39. The method of claim 38, wherein said generated compressed video signal is a variable bit rate signal, and further comprising the steps of:
    - (e) buffering said generated compressed video signal; and
      
      (f) providing a signal indicative of a buffering overflow condition, wherein said quantization mode signal is also responsive to said overflow signal such that said selected quantization mode is reflective of said perceptual importance of said object as constrained by a buffering limitation.
  - 40. The method of claim 39, further comprising the step of assigning, prior to generating said quantization mode signal, a current macroblock of video data within said received frames or fields of video information to one of said objects, wherein said quantization mode signal is reflective of said assigned object.
  - 41. The method of claim 40, further comprising the step of generating a signal indicative of a target bit rate associated with said assigned object prior to generating said quantization mode signal.
  - 42. The method of claim 41, further comprising the steps of:
    - (i) adding to a virtual buffer count indicative of a virtual buffer occupancy whenever compressed video signal information is buffered;
      
      (ii) subtracting from said count whenever buffered compressed video signal information is output in an amount which is dependent on said target bit rate signal; and
      
      (iii) generating a virtual buffer occupancy signal indicative of said count, wherein said quantization mode signal is also responsive to said virtual buffer occupancy signal such that said selected quantization mode is reflective of said objects indicated by said virtual buffer occupancy as constrained by said buffering limitation.
  - 43. The method of claim 42, wherein said virtual buffer occupancy $B_i$ is determined by the equation $B_i = B_{i-1} + b_i - r\,(R_k/R)$, where $b_i$ is equal to the number of bits used to encode the present macroblock indicated by said encoder output signal, $r$ is equal to the number of bits output by said video buffer indicated by said clock signal, $R_k$ is the target bit rate indicated by said target bit rate signal, and $R$ is the average output bit rate required to prevent said buffering limitation from occurring.
  - 44. The method of claim 40, further comprising the step of generating, prior to generating said quantization mode signal, a buffer size modulation signal whenever said assigned object is different from an immediately preceding assigned object, wherein said quantization mode signal is also responsive to said buffer size modulation signal such that said selected quantization mode is reflective of said object indicated by said virtual buffer capacity as modulated by said buffer size modulation signal and as constrained by said buffering limitation.
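
The virtual buffer bookkeeping recited in claims 42 and 43 can be sketched in a few lines. This is only an illustration of the stated equation, not the patented circuit: the function name `update_virtual_buffer`, its parameter names, and all sample numbers are hypothetical, and the claim itself does not say whether the count is clamped, so no clamping is applied here.

```python
def update_virtual_buffer(prev_occupancy, bits_encoded, bits_output,
                          target_rate, average_rate):
    """Return B_i = B_{i-1} + b_i - r * (R_k / R).

    prev_occupancy -- B_{i-1}, the virtual buffer count before this macroblock
    bits_encoded   -- b_i, bits used to encode the present macroblock
    bits_output    -- r, bits output by the video buffer in the same interval
    target_rate    -- R_k, target bit rate of the assigned object
    average_rate   -- R, average output bit rate
    """
    if average_rate <= 0:
        raise ValueError("average_rate must be positive")
    return prev_occupancy + bits_encoded - bits_output * (target_rate / average_rate)


# Hypothetical numbers, chosen only to show the update running over three macroblocks.
occupancy = 0.0
for bits_encoded in (1200, 800, 1500):
    occupancy = update_virtual_buffer(
        occupancy,
        bits_encoded,
        bits_output=500,
        target_rate=1_500_000,   # assumed R_k for an important object
        average_rate=1_000_000,  # assumed R
    )
```

Under this reading, an object with a target rate above the average drains the virtual count faster than the physical buffer empties, so a rate controller driven by the virtual occupancy would presumably allow finer quantization for that object.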

45. An object segmentation circuit for receiving depth information which corresponds to a frame of video information, and for identifying one or more separate objects within said frame of video information, comprising:
- (a) an object map generation circuit for receiving said depth information, and for converting said depth information into an object map to thereby associate each pixel within said frame of video information with one of one or more regions of varying perceptual importance within said frame, wherein said object map generation circuit comprises:
  
  (1) a histogram generation circuit for receiving said depth information and for computing a histogram of said depth information to thereby provide the number of pixels which have a predetermined depth value for a range of predetermined values;
  
  (2) a first logic circuit, coupled to said histogram generation circuit, for receiving said generated histogram and for setting all values in said histogram which are below a predetermined threshold value to zero to thereby generate a clipped histogram;
  
  (3) a second logic circuit, coupled to said first logic circuit, for receiving said clipped histogram and for scanning said clipped histogram to find boundaries of n regions with n different threshold depth values; and
  
  (4) a variable step quantization circuit, coupled to said second logic circuit, for receiving said n different threshold values and said depth information, and for quantizing said depth information based on said n different threshold values, to thereby generate said object map;
  
  (b) a region masking circuit, coupled to said object map generation circuit and receiving said object map, for masking said object map to generate one or more depth region masks indicative of pixels within said frame which substantially correspond to preselected regions of depth; and
  
  (c) a video object selection circuit, coupled to said region masking circuit and receiving said generated one or more region masks, for identifying one or more separate objects within each of said one or more preselected regions indicated by each of said one or more region masks, such that each object associated with each depth region is identified as a separate object.
- View Dependent Claims (46)
- - 46. The apparatus of claim 45, wherein said histogram generation circuit comprises:
    - (i) a buffer for receiving and temporarily storing said depth information;
      
      (ii) a memory, coupled to said buffer, for receiving said depth information from said buffer as memory addresses and for storing histogram values as said memory addresses; and
      
      (iii) a logic circuit, coupled to said memory, for reading a histogram value from said memory at an address location, updating said histogram value, and providing said updated histogram value to said memory at said address location.
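
The four stages of the object map generation circuit in claim 45 (histogram, clipping, boundary scan, variable step quantization) can be sketched in software as follows. This is a minimal sketch under stated assumptions, not the claimed hardware: all function names are hypothetical, each run of consecutive nonzero bins in the clipped histogram is assumed to be one region, and the last depth value of each run is assumed to serve as that region's threshold. The claim does not fix either choice.

```python
from bisect import bisect_left


def depth_histogram(depth, levels=256):
    """Count pixels per depth value; depth is a 2-D list of ints in [0, levels)."""
    hist = [0] * levels
    for row in depth:
        for d in row:
            hist[d] += 1  # read, update, write back at address d (compare claim 46)
    return hist


def clip_histogram(hist, min_count):
    """Set every bin below min_count to zero."""
    return [count if count >= min_count else 0 for count in hist]


def find_region_thresholds(clipped):
    """Scan the clipped histogram; return the last depth value of each nonzero run.

    Treating each run as one region is an assumption made for this sketch.
    """
    thresholds = []
    in_run = False
    for depth_value, count in enumerate(clipped):
        if count > 0:
            in_run = True
        elif in_run:
            thresholds.append(depth_value - 1)
            in_run = False
    if in_run:
        thresholds.append(len(clipped) - 1)
    return thresholds


def quantize_depth(depth, thresholds):
    """Map each depth value to the index of the first region whose threshold covers it."""
    if not thresholds:
        raise ValueError("no regions found in clipped histogram")
    last = len(thresholds) - 1
    # Depths beyond the final threshold are folded into the last region.
    return [[min(bisect_left(thresholds, d), last) for d in row] for row in depth]


# Hypothetical 3x4 depth frame with a near cluster (10-11), a far cluster (200-201),
# and one stray pixel (90) that the clipping step discards.
depth = [
    [10, 10, 11, 200],
    [10, 11, 200, 200],
    [90, 11, 201, 201],
]
clipped = clip_histogram(depth_histogram(depth), min_count=2)
thresholds = find_region_thresholds(clipped)
object_map = quantize_depth(depth, thresholds)
```

With these sample values the scan finds two regions, and the stray pixel at depth 90 falls into the farther one because its own bin was clipped to zero.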

47. A method for identifying one or more separate objects within depth information which corresponds to a frame of video information, comprising the steps of:
- (a) receiving said depth information;
  
  (b) converting said received depth information into an object map to thereby associate each pixel within said frame of video information with one of one or more regions of varying perceptual importance within said frame, wherein said converting step comprises the steps of:
  
  (1) computing a histogram of said received depth information to thereby provide the number of pixels which have a predetermined depth value for a range of predetermined values;
  
  (2) setting all values in said histogram which are below a predetermined threshold value to zero to thereby generate a clipped histogram;
  
  (3) scanning said clipped histogram to find boundaries of n regions with n different threshold depth values; and
  
  (4) quantizing said depth information based on said n different threshold values;
  
  (c) masking said object map to generate one or more depth region masks indicative of pixels within said frame which substantially correspond to preselected regions of depth; and
  
  (d) identifying one or more separate objects within each of said one or more preselected regions indicated by said one or more region masks, such that each object associated with each depth region is identified as a separate object.
- View Dependent Claims (48, 49)
- - 48. The method of claim 47, wherein said identifying step comprises the steps of:
    - (i) scanning a preselected depth region mask until a pixel with a nonmasked value is found;
      
      (ii) searching neighboring pixels within said mask to find any other neighboring pixels with nonmasked values;
      
      (iii) repeating said searching step until no neighboring pixels have a nonmasked value to identify all neighboring found pixels with nonmasked values as a video object plane which corresponds to an object within said depth region;
      
      (iv) masking said object from said depth region mask; and
      
      (v) repeating steps (i)-(iv) until all pixels within said depth region mask are masked to thereby identify one or more video object planes within said received frame of video information.
  - 49. The method of claim 48, wherein said identifying step further comprises the steps of:
    - (vi) selecting one of said one or more video object planes, and one of one or more video object planes associated with an immediately preceding frame of video information;
      
      (vii) comparing said selected video object plane and said selected previous frame video object plane to determine a depth difference therebetween;
      
      (viii) repeating step (vii) after selecting a different one of said one or more previous frame video object planes unless all of said one or more previous frame video object planes have been selected;
      
      (ix) assigning said selected video object plane to a video object which corresponds to one of said one or more previous frame video object planes for which a depth difference therebetween is minimized as compared to all of said determined depth differences; and
      
      (x) repeating steps (vii)-(ix) after selecting a different one of said one or more video object planes unless all of said one or more video object planes have been selected, so that each of said one or more video object planes identifies an object.
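
Steps (i) through (v) of claim 48 describe what amounts to connected-component extraction on a depth region mask, and steps (vi) through (x) of claim 49 assign each resulting video object plane to the previous-frame plane with the smallest depth difference. The sketch below illustrates both under explicit assumptions: the function names are hypothetical, neighbors are taken to be the four edge-adjacent pixels, and "depth difference" is taken to be the absolute difference of mean depths. The claims specify none of these details.

```python
from collections import deque


def extract_object_planes(mask):
    """Split a depth region mask into connected planes.

    mask is a 2-D list of bools, True marking a nonmasked pixel.
    Returns a list of sets of (row, col) coordinates, one set per plane.
    """
    remaining = [row[:] for row in mask]  # working copy; found pixels get masked
    rows = len(remaining)
    cols = len(remaining[0]) if rows else 0
    planes = []
    for r in range(rows):
        for c in range(cols):
            if not remaining[r][c]:
                continue  # step (i): keep scanning for a nonmasked pixel
            plane = set()
            queue = deque([(r, c)])
            remaining[r][c] = False
            while queue:  # steps (ii)-(iii): grow until no neighbor is nonmasked
                y, x = queue.popleft()
                plane.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and remaining[ny][nx]:
                        remaining[ny][nx] = False  # step (iv): mask as we go
                        queue.append((ny, nx))
            planes.append(plane)
    return planes


def mean_depth(plane, depth):
    """Average depth over a plane's pixels (an assumed summary statistic)."""
    return sum(depth[y][x] for y, x in plane) / len(plane)


def match_to_previous(current_depths, previous_depths):
    """For each current plane, return the index of the closest previous-frame plane."""
    return [
        min(range(len(previous_depths)), key=lambda j: abs(previous_depths[j] - d))
        for d in current_depths
    ]


# Hypothetical mask and depth frame containing two separate objects.
mask = [
    [True, True, False, False],
    [False, False, False, True],
    [False, False, True, True],
]
depth = [
    [10, 12, 0, 0],
    [0, 0, 0, 200],
    [0, 0, 198, 202],
]
planes = extract_object_planes(mask)
current = [mean_depth(p, depth) for p in planes]
assignment = match_to_previous(current, previous_depths=[205.0, 9.0])
```

Masking pixels as they are queued, rather than after a whole plane is found, is a design choice of this sketch; it keeps each pixel from being visited twice while still ending with every pixel of the region mask masked.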

Current Assignee
Trustees Of Columbia University In The City Of New York (Columbia University)
Original Assignee
Trustees Of Columbia University In The City Of New York (Columbia University)
Inventors
Anastassiou, Dimitris; Chang, Shih-Fu; Eleftheriadis, Alexandros; Nayar, Shree
Primary Examiner(s)
Lee, Thomas D.
Assistant Examiner(s)
Brinich, Stephen M.

Application Number

US08/723,467
Time in Patent Office

1,294 Days
Field of Search

382/154, 382/236, 382/171, 382/173, 382/255, 382/291, 382/240, 382/251, 348/42-50, 348/397, 348/398, 348/405, 348/419
US Class Current

382/154
CPC Class Codes

G06T 9/007   Transform coding, e.g. disc...

H04N 19/124   Quantisation

H04N 19/14   Coding unit complexity, e.g...

H04N 19/149   by estimating the code amou...

H04N 19/15   by monitoring actual compre...

H04N 19/152   by measuring the fullness o...

H04N 19/176   the region being a block, e...

H04N 19/20   using video object coding

H04N 19/597   specially adapted for multi...
