Analytics-modulated coding of surveillance video
5 Assignments
0 Petitions
Abstract
A method and apparatus for encoding surveillance video where one or more regions of interest are identified and the encoding parameter values associated with those regions are specified in accordance with intermediate outputs of a video analytics process. Such an analytics-modulated video compression approach allows the coding process to adapt dynamically based on the content of the surveillance images. In this manner, the fidelity of the region of interest is increased relative to that of a background region such that the coding efficiency is improved, including instances when no target objects appear in the scene. Better compression results can be achieved by assigning different coding priority levels to different types of detected objects.
142 Citations
33 Claims
1. A method, comprising:
receiving, at an analytics module implemented in at least one of a memory or a processing device, a video frame having a plurality of pixels;
assigning, at the analytics module and without user intervention, based on a type of a foreground object from the video frame having the plurality of pixels, a class from a plurality of predetermined classes to the foreground object;
adjusting a quantization parameter value associated with the foreground object based on a weight associated with the class assigned to the foreground object, a size of the foreground object, and a target bit rate associated with the video frame, the weight being based on a coding priority associated with the class assigned to the foreground object;
producing a plurality of DCT coefficients for pixels from the plurality of pixels of the video frame associated with the foreground object;
quantizing, at a quantization module, the DCT coefficients associated with the foreground object based on the adjusted quantization parameter value;
coding the quantized DCT coefficients associated with the foreground object to produce coded quantized DCT coefficients; and
sending, to a storage module, a representation of the coded quantized DCT coefficients.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
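The class-weighted quantization adjustment of claim 1 can be sketched in a few lines of Python. Everything below is an illustrative assumption, not taken from the patent: the baseline QP, the per-class weights, and the adjustment formula are placeholders chosen only to show how class priority, object size, and bit budget could jointly lower the QP (and so raise the fidelity) of a foreground object.

```python
# Hypothetical sketch of analytics-modulated QP adjustment.
# BASE_QP, CLASS_WEIGHTS, and the delta formula are assumptions for
# illustration; the patent does not specify these values.

BASE_QP = 30  # assumed encoder baseline quantization parameter

# Assumed class weights: higher coding priority -> larger weight -> lower QP.
CLASS_WEIGHTS = {"person": 1.0, "vehicle": 0.8, "animal": 0.4, "unknown": 0.2}

def adjust_qp(object_class, object_size_px, frame_area_px, target_bitrate_kbps):
    """Return a QP for a foreground object, clamped to H.264's 0..51 range."""
    weight = CLASS_WEIGHTS.get(object_class, CLASS_WEIGHTS["unknown"])
    size_fraction = object_size_px / frame_area_px
    # Assumed rule: high-priority or large objects get finer quantization,
    # tempered by how much bit budget the frame has available.
    budget_factor = min(1.0, target_bitrate_kbps / 2000.0)
    delta = round(10 * weight * budget_factor + 5 * size_fraction)
    return max(0, min(51, BASE_QP - delta))
```

For example, under these assumed weights a person occupying 10,000 pixels of a 1080p frame at 2 Mb/s would be quantized noticeably more finely (QP 20) than an unclassified object of the same size (QP 28).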
11. A method, comprising:
receiving, at an analytics module implemented in at least one of a memory or a processing device, a first video frame having a plurality of blocks of pixels and a second video frame having a plurality of blocks of pixels;
assigning, at the analytics module and without user intervention, a class, based on a type of a foreground object from the first video frame, from a plurality of predetermined classes to the foreground object, the foreground object including a block of pixels from the plurality of blocks of pixels of the first video frame, each class from the plurality of predetermined classes having associated therewith a coding priority;
identifying in the second video frame a prediction block of pixels associated with the block of pixels in the foreground object, the identifying being based on a prediction search window having a search area associated with the class assigned to the foreground object;
coding, at a coding module, the first video frame based on the identified prediction block of pixels, a size of the foreground object, and a target bit rate associated with the first video frame to produce a coded video frame; and
sending, to a storage module, a representation of the coded video frame.
View Dependent Claims (12, 13, 14, 15, 16, 17, 18)
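The class-conditioned prediction search window of claim 11 could look like the following sketch. The per-class pixel ranges are assumed values chosen only to illustrate the idea that fast-moving classes (e.g. vehicles) warrant a wider motion search than slower ones; the specification does not give these numbers.

```python
# Illustrative sketch: size a motion-prediction search window by object class.
# SEARCH_RANGE values are assumptions, not from the patent.

SEARCH_RANGE = {"vehicle": 32, "person": 16, "animal": 16, "background": 8}

def prediction_search_window(center_x, center_y, object_class,
                             frame_w, frame_h, block=16):
    """Return the (x0, y0, x1, y1) search area for a block at (center_x, center_y),
    clipped so every candidate block stays inside the frame."""
    r = SEARCH_RANGE.get(object_class, 8)
    x0 = max(0, center_x - r)
    y0 = max(0, center_y - r)
    x1 = min(frame_w - block, center_x + r)
    y1 = min(frame_h - block, center_y + r)
    return (x0, y0, x1, y1)
```

Wider windows cost more search time but track fast motion better, which is why tying the range to the assigned class can save computation on low-priority regions.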
19. A non-transitory processor-readable medium storing code representing instructions to be executed by a processor, the code comprising code to cause the processor to:
assign, without user intervention, based on a type of a foreground object from a video frame having a plurality of pixels, a class from a plurality of predetermined classes to the foreground object, each class from the plurality of predetermined classes having associated therewith a coding priority;
track motion information associated with the foreground object in a first video frame having a plurality of blocks of pixels, the foreground object including a block of pixels from the plurality of blocks of pixels of the first video frame;
identify in a second video frame having a plurality of blocks of pixels a prediction block of pixels associated with the block of pixels in the foreground object, the identification of the prediction block of pixels based on a prediction search window having a search area associated with the tracked motion information associated with the foreground object, the search area of the prediction search window being updated according to the class assigned to the foreground object;
code the first video frame based on the identified prediction block of pixels and a target bit rate associated with the first video frame to produce a coded video frame; and
send, to a storage module, a representation of the coded video frame.
View Dependent Claims (20, 21, 22)
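Claim 19 couples the search area to the object's tracked motion as well as its class. A minimal sketch, assuming a standard sum-of-absolute-differences (SAD) block search and an invented range-update rule (grow the window with tracked speed, capped per class):

```python
# Hedged sketch: motion- and class-adapted block matching.
# The range-update rule and the class caps are assumptions; SAD block
# matching itself is standard video-coding practice.

def search_range(velocity, object_class):
    """Grow the search range with tracked speed, capped per class (assumed caps)."""
    cap = {"vehicle": 48, "person": 24}.get(object_class, 16)
    return min(cap, 8 + 2 * max(abs(velocity[0]), abs(velocity[1])))

def sad(ref, cur, rx, ry, cx, cy, n):
    """Sum of absolute differences between an n x n block of `ref` at (rx, ry)
    and an n x n block of `cur` at (cx, cy)."""
    return sum(abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
               for j in range(n) for i in range(n))

def find_prediction_block(ref, cur, cx, cy, velocity, object_class, n=4):
    """Exhaustive SAD search in `ref` around (cx, cy); window set by the
    tracked velocity and assigned class. Returns (SAD, x, y) of the best match."""
    r = search_range(velocity, object_class)
    h, w = len(ref), len(ref[0])
    best = None
    for ry in range(max(0, cy - r), min(h - n, cy + r) + 1):
        for rx in range(max(0, cx - r), min(w - n, cx + r) + 1):
            cost = sad(ref, cur, rx, ry, cx, cy, n)
            if best is None or cost < best[0]:
                best = (cost, rx, ry)
    return best
```

A real encoder would use a fast search pattern (diamond, hexagon) rather than the exhaustive scan shown here; the point is only how the window bounds are derived.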
23. A method, comprising:
receiving, at an analytics module implemented in at least one of a memory or a processing device, a plurality of pictures associated with a scene;
assigning, at the analytics module and without user intervention, based on a type of a foreground object from a picture in a first group of pictures (GOP) from the plurality of pictures, a class from a plurality of predetermined classes to the foreground object, each class from the plurality of predetermined classes having associated therewith a coding priority, the first GOP (1) having a first number of frames between two intra-frames, and (2) associated with the scene at a first time;
tracking, at a tracking module, motion information associated with the foreground object over a plurality of pictures;
inserting an intra-frame picture in the first GOP based on the tracked motion information associated with the foreground object and the coding priority associated with the class assigned to the foreground object;
defining a second GOP from the plurality of pictures and associated with the scene at a second time after the first time to have a second number of frames between two intra-frames based on the foreground object leaving the scene after the first time and before the second time, the second number of frames being different than the first number of frames; and
sending, to a storage module, a representation of at least one of the first GOP or the second GOP.
View Dependent Claims (24, 25)
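The adaptive GOP logic of claim 23 reduces to two decisions: when to force an extra intra-frame, and how far apart intra-frames should sit once the scene empties. A minimal sketch, in which the thresholds and the two GOP lengths are assumed values, not figures from the specification:

```python
# Hypothetical sketch of analytics-driven GOP adaptation.
# SHORT_GOP, LONG_GOP, and both thresholds are illustrative assumptions.

SHORT_GOP = 15   # assumed I-frame spacing while priority objects are in scene
LONG_GOP = 60    # assumed spacing once the scene is empty

def needs_intra_frame(motion_magnitude, coding_priority,
                      motion_threshold=8, priority_threshold=2):
    """Force an extra I-frame for a fast-moving, high-priority object."""
    return (motion_magnitude > motion_threshold
            and coding_priority >= priority_threshold)

def gop_length(scene_has_foreground):
    """Shorter GOP while objects of interest are present, longer when empty."""
    return SHORT_GOP if scene_has_foreground else LONG_GOP
```

Lengthening the GOP for an empty scene spends fewer bits on intra-frames exactly when nothing of interest is happening, which is where the abstract's claim of improved efficiency "when no target objects appear" comes from.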
26. A method, comprising:
receiving, at an analytics module implemented in at least one of a memory or a processing device, a plurality of pictures associated with a scene;
assigning, at the analytics module and without user intervention, based on a type of a foreground object from a picture in a first group of pictures (GOP) from the plurality of pictures, a class from a plurality of predetermined classes to the foreground object, each class from the plurality of predetermined classes having associated therewith a coding priority, the first group of pictures (1) having a first number of frames between two intra-frames, and (2) associated with the scene at a first time;
tracking, at a tracking module, motion information associated with the foreground object over a plurality of pictures;
replacing a block of pixels in the foreground object with an intra-frame block of pixels based on the tracked motion information associated with the foreground object and the coding priority associated with the class assigned to the foreground object;
defining a second GOP from the plurality of pictures and associated with the scene at a second time after the first time to have a second number of frames between two intra-frames based on the foreground object leaving the scene after the first time and before the second time, the first number of frames being different than the second number of frames; and
sending, to a storage module, a representation of the second GOP.
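Claim 26 refreshes individual blocks as intra rather than inserting whole intra-frames. One way to sketch the selection step, under the assumption (mine, not the patent's) that a limited per-frame refresh budget goes to the fastest-moving foreground blocks, with low-priority classes granted only half the budget:

```python
# Illustrative block-level intra-refresh selection; the budget policy is an
# assumption chosen to show how priority and motion could drive the choice.

def blocks_to_intra_refresh(block_motion, coding_priority, budget):
    """Pick up to `budget` foreground blocks to re-code as intra, fastest
    first; classes below priority 2 get only half the budget (assumed rule).
    `block_motion` maps a block id to its tracked motion magnitude."""
    limit = budget if coding_priority >= 2 else budget // 2
    ranked = sorted(block_motion, key=block_motion.get, reverse=True)
    return ranked[:limit]
```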
27. A method, comprising:
receiving, at an analytics module implemented in at least one of a memory or a processing device, a group of pictures (GOP);
segmenting a foreground object from a background of a picture in the GOP, the foreground object of the picture having a plurality of pixels organized into a plurality of blocks of pixels, the background of the picture having a plurality of pixels organized into a plurality of blocks of pixels;
tracking motion information associated with a block of pixels from the plurality of blocks of pixels of the foreground object, a first block of pixels from the plurality of blocks of pixels of the background, and a second block of pixels from the plurality of blocks of pixels of the background;
encoding, at an encode module, the block of pixels from the plurality of blocks of pixels of the foreground object as an intra-coded block of pixels to produce an encoded intra-coded block of pixels based on (1) the motion information associated with the block of pixels from the plurality of blocks of pixels of the foreground object, (2) a size of the foreground object, and (3) a target bit rate associated with the GOP;
encoding, at the encode module, the first block of pixels from the plurality of blocks of pixels of the background as a predictive-coded block of pixels to produce an encoded predictive-coded block of pixels based on the motion information associated with the first block of pixels from the plurality of blocks of pixels of the background and the target bit rate associated with the GOP;
encoding, at the encode module, the second block of pixels from the plurality of blocks of pixels of the background as a skipped block of pixels to produce an encoded skipped block of pixels based on the motion information associated with the second block of pixels from the plurality of blocks of pixels of the background and the target bit rate associated with the GOP; and
sending, to a storage module, a representation of at least one of the encoded intra-coded block of pixels, the encoded predictive-coded block of pixels, or the encoded skipped block of pixels.
View Dependent Claims (28)
29. An apparatus, comprising:
a receive module implemented in at least one of a memory or a processing device, the receive module configured to receive a video frame having a plurality of pixels;
a segment module configured to segment, from the video frame, a foreground object from a background, the foreground object being from a plurality of foreground objects, the foreground object of the video frame having a plurality of pixels organized into a plurality of blocks of pixels, the background of the video frame having a plurality of pixels organized into a plurality of blocks of pixels;
a track module configured to track motion information associated with the block of pixels from the plurality of blocks of pixels of the foreground object, a first block of pixels from the plurality of blocks of pixels of the background, and a second block of pixels from the plurality of blocks of pixels of the background;
an encode module configured to encode the block of pixels from the plurality of blocks of pixels of the foreground object as an intra-coded macroblock based on (1) the motion information associated with the block of pixels from the plurality of blocks of pixels of the foreground object, (2) a quantity of foreground objects from the plurality of foreground objects, and (3) a target bit rate associated with the video frame to produce an encoded intra-coded macroblock;
an encode module configured to encode the first block of pixels from the plurality of blocks of pixels of the background as a predictive-coded macroblock based on the motion information associated with the first block of pixels from the plurality of blocks of pixels of the background and the target bit rate associated with the video frame to produce an encoded predictive-coded macroblock;
an encode module configured to encode the second block of pixels from the plurality of blocks of pixels of the background as at least one of a bidirectionally-predictive coded macroblock or a skipped macroblock based on the motion information associated with the second block of pixels from the plurality of blocks of pixels of the background and the target bit rate associated with the video frame to produce at least one of an encoded bidirectionally-predictive coded macroblock or an encoded skipped macroblock; and
a send module configured to send a representation of at least one of the encoded intra-coded macroblock, the encoded predictive-coded macroblock, or the encoded skipped macroblock.
View Dependent Claims (30, 31, 32, 33)
Specification