Data compression method and apparatus
First Claim
1. A method for calculating a perceptual distance between a data signal and a first representation of said data signal, comprising the steps of:
partitioning a time ordered representation of said data signal using a windowing processor;
decomposing said partitioned data into multiple channels using a filter bank, each channel representing partitioned data components of respective frequency bands;
quantizing said decomposed data in a quantization engine;
reconstructing said decomposed data from said quantized data using a decoder;
determining an energy difference between said quantized data and said reconstructed data in a difference energy processor; and
adjusting said quantization engine based upon a comparison between a predetermined threshold and said energy difference.
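The steps recited in claim 1 can be illustrated with a short sketch. This is not the patented implementation. It assumes a two-channel Haar filter bank, a uniform scalar quantizer, and a step-halving adjustment rule, none of which are specified in the claim. It also reads the "energy difference" as the energy of the quantization error, that is, the difference between the decomposed coefficients and their reconstruction. All function names are hypothetical.

```python
import math

def frames(signal, size):
    """Partition a time-ordered signal into non-overlapping windows of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

def haar_bank(frame):
    """Assumed filter bank: split an even-length frame into low and high bands (Haar)."""
    s = math.sqrt(2.0)
    low = [(frame[i] + frame[i + 1]) / s for i in range(0, len(frame) - 1, 2)]
    high = [(frame[i] - frame[i + 1]) / s for i in range(0, len(frame) - 1, 2)]
    return [low, high]

def quantize(channel, step):
    """Assumed quantization engine: uniform scalar quantizer returning integer indices."""
    return [round(c / step) for c in channel]

def dequantize(indices, step):
    """Decoder: reconstruct coefficient values from the quantizer indices."""
    return [q * step for q in indices]

def error_energy(channel, recon):
    """Energy of the difference between the coefficients and their reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(channel, recon))

def fit_step(channel, threshold, step=1.0, min_step=1e-6):
    """Halve the step until the error energy is at or below the threshold.

    If `min_step` is reached first, the returned step may not meet the threshold.
    """
    while step > min_step:
        recon = dequantize(quantize(channel, step), step)
        if error_energy(channel, recon) <= threshold:
            break
        step /= 2.0
    return step

if __name__ == "__main__":
    signal = [math.sin(0.3 * n) for n in range(32)]
    for frame in frames(signal, 8):
        steps = [fit_step(channel, threshold=0.01) for channel in haar_bank(frame)]
        print(steps)
```

In this sketch the threshold is a fixed error energy per channel. In the method described in the abstract, the threshold would instead be tied to the perceptual distance, so that bits are spent only where a listener or viewer could detect the error.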
Abstract
A method and apparatus are provided for measuring the "perceptual distance" between an original sensory signal (such as an audio or video signal) and an approximate, reconstructed representation of that signal. The perceptual distance in this context is a direct quantitative measure of the likelihood that a human observer can distinguish the original audio or video signal from its reconstructed approximation. The method applies to noisy compression techniques: it predicts the likelihood that the reconstructed noisy representation of the original signal will be distinguishable by a human observer from the original input representation. The method can be used to allocate bits in audio and video compression algorithms such that the signal reconstructed from the compressed representation is perceptually similar to the original input signal when judged by a human observer. The method is based on a theory of the neurophysiological limitations of human sensory perception. Specifically, a "neural encoding model" (NEM) summarizes the manner in which sensory signals are represented in the human brain. The NEM is analyzed in the context of detection theory, which provides a mathematical framework for statistically quantifying the detectability of differences in the neural representation arising from differences in sensory input. This NEM approach has been validated by demonstrating its ability to predict a variety of published psychoacoustic data, including masking and many other phenomena.
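To make the detection-theory framing concrete, the sketch below uses the standard equal-variance Gaussian model from textbook signal detection theory. It is an illustration only and is not the NEM of this patent: the sensitivity index d', the root-sum-of-squares combination across independent channels, and the two-alternative forced-choice formula are general results, and the function names are hypothetical.

```python
import math

def d_prime(mean_a, mean_b, sigma):
    """Sensitivity index for two equal-variance Gaussian response distributions."""
    return abs(mean_a - mean_b) / sigma

def combine_channels(d_primes):
    """Combined sensitivity for statistically independent channels, optimally pooled."""
    return math.sqrt(sum(d * d for d in d_primes))

def prob_correct_2afc(d):
    """Probability of a correct answer in a two-alternative forced-choice task.

    Equals Phi(d / sqrt(2)), written here with the error function.
    """
    return 0.5 * (1.0 + math.erf(d / 2.0))

if __name__ == "__main__":
    per_channel = [d_prime(1.0, 1.2, 0.5), d_prime(0.8, 0.8, 0.5), d_prime(2.0, 2.5, 0.5)]
    total = combine_channels(per_channel)
    print(total, prob_correct_2afc(total))
```

Under this model a combined d' of zero corresponds to chance performance (probability 0.5), which matches the abstract's reading of perceptual distance as the likelihood that an observer can tell the two signals apart.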
91 Claims
1. A method for calculating a perceptual distance between a data signal and a first representation of said data signal, comprising the steps of:
partitioning a time ordered representation of said data signal using a windowing processor;
decomposing said partitioned data into multiple channels using a filter bank, each channel representing partitioned data components of respective frequency bands;
quantizing said decomposed data in a quantization engine;
reconstructing said decomposed data from said quantized data using a decoder;
determining an energy difference between said quantized data and said reconstructed data in a difference energy processor; and
adjusting said quantization engine based upon a comparison between a predetermined threshold and said energy difference.
Dependent Claims (2)
3. A method for calculating a perceptual distance between a data signal and a first representation of said data signal, comprising the steps of:
partitioning a time ordered representation of said data signal using a windowing processor;
decomposing said partitioned data into vector coefficients of uniformly distributed frequency components using a discrete frequency transform;
quantizing said decomposed data in a quantization engine;
reconstructing said decomposed data from said quantized data using a decoder;
determining an energy difference between said quantized data and said reconstructed data in a difference energy processor; and
adjusting said quantization engine based upon a comparison between a predetermined threshold and said energy difference.
Dependent Claims (4)
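Claim 3 differs from claim 1 in the decomposition step, which uses a discrete frequency transform yielding uniformly spaced frequency components. A minimal sketch of that variant follows, assuming a naive discrete Fourier transform, a uniform quantizer applied separately to the real and imaginary parts, and a step-halving rule. These choices and all function names are assumptions made for illustration and are not taken from the claim.

```python
import cmath

def dft(frame):
    """Naive discrete Fourier transform: one coefficient per uniformly spaced frequency."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(coeffs):
    """Inverse transform, returning the real part of each reconstructed sample."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def quantize(coeffs, step):
    """Assumed quantizer: round real and imaginary parts to integer multiples of `step`."""
    return [complex(round(c.real / step), round(c.imag / step)) for c in coeffs]

def dequantize(indices, step):
    """Decoder: scale the integer indices back to coefficient values."""
    return [q * step for q in indices]

def error_energy(coeffs, recon):
    """Energy of the difference between the coefficients and their reconstruction."""
    return sum(abs(a - b) ** 2 for a, b in zip(coeffs, recon))

def fit_step(frame, threshold, step=1.0, min_step=1e-6):
    """Halve the step until the coefficient error energy is at or below the threshold.

    If `min_step` is reached first, the returned step may not meet the threshold.
    """
    coeffs = dft(frame)
    while step > min_step:
        recon = dequantize(quantize(coeffs, step), step)
        if error_energy(coeffs, recon) <= threshold:
            break
        step /= 2.0
    return step

if __name__ == "__main__":
    frame = [0.5, -0.25, 1.0, 0.75, -1.0, 0.1, 0.0, 0.3]
    print(fit_step(frame, threshold=0.01))
```

The naive transform costs on the order of N squared operations per frame and is used here only to keep the example dependency-free; a practical coder would use a fast transform.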
5. A compression system providing an intermediate, compressed representation of an original signal from which a final reconstruction of said original signal is to be generated, comprising:
a lossy coder for receiving said original signal and for generating said compressed representation based upon perceptual distance data calculated between a neural encoding model representation of an intermediate reconstruction of said original signal and a neural encoding model representation of said original signal.
Dependent Claims (6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54)
55. A data compression method for providing an intermediate, compressed representation of an original signal from which a final reconstruction of said original signal is to be generated, comprising:
receiving said original signal by a lossy coder;
generating a neural encoding model representation of an intermediate reconstruction of said original signal by said lossy coder;
generating a neural encoding model representation of said original signal by said lossy coder;
calculating, by said lossy coder, a perceptual distance between said neural encoding model representation of an intermediate reconstruction of said original signal and said neural encoding model representation of said original signal; and
requantizing, by said lossy coder, said original signal to form said intermediate, compressed representation of said original signal, based upon said perceptual distance.
Dependent Claims (56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91)
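The feedback loop of claim 55 can be sketched as follows. The neural encoding model is not defined in this section, so `toy_nem` below is a purely hypothetical stand-in (log-compressed energy per time segment) used only to make the loop runnable. The Euclidean distance, the uniform quantizer, and the step-halving rule are likewise assumptions and are not taken from the patent.

```python
import math

def toy_nem(signal, segments=4):
    """HYPOTHETICAL stand-in for a neural encoding model: log energy per time segment."""
    size = len(signal) // segments
    return [math.log1p(sum(x * x for x in signal[s * size:(s + 1) * size]))
            for s in range(segments)]

def perceptual_distance(rep_a, rep_b):
    """Assumed distance measure: Euclidean distance between two model representations."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(rep_a, rep_b)))

def encode(signal, step):
    """Assumed lossy coder: uniform scalar quantization to integer indices."""
    return [round(x / step) for x in signal]

def decode(indices, step):
    """Intermediate reconstruction of the signal from the quantizer indices."""
    return [q * step for q in indices]

def requantize(signal, max_distance, step=1.0, min_step=1e-6):
    """Refine the quantizer until the model-domain distance is small enough.

    Returns the compressed indices, the final step, and the final distance.
    Stops early if `min_step` is reached, in which case the distance may exceed the target.
    """
    target = toy_nem(signal)
    while True:
        compressed = encode(signal, step)
        distance = perceptual_distance(toy_nem(decode(compressed, step)), target)
        if distance <= max_distance or step <= min_step:
            return compressed, step, distance
        step /= 2.0

if __name__ == "__main__":
    signal = [math.sin(0.2 * n) for n in range(64)]
    compressed, step, distance = requantize(signal, max_distance=0.05)
    print(step, distance)
```

The point of the sketch is the control flow: the coder compares model representations of the original and of its own intermediate reconstruction, and requantizes based on that distance rather than on raw sample error.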
Specification