Ordered data compression system and methods using principle component analysis

US 7,788,191 B2
Filed: 05/18/2005
Issued: 08/31/2010
Est. Priority Date: 12/26/2002
Status: Expired due to Fees

First Claim

Patent Images

1. A method for compressing data, comprising:

accessing data by a processor, the data being represented by vectors x, wherein each vector is an ordered set of pixels, and wherein each pixel comprises an n-tuple;

permuting the ordering of pixels in a set of data S=(x₁, . . . x_t), over a range of permutations to obtain a bag of unordered pixels representing the data, wherein the range of allowed permutations is linearly constrained;

associating a convex cost function with the permutations, wherein the cost function is derived by modeling data as a Gaussian distribution and identified as the determinant of a regularized covariance of the data;

minimizing the convex cost function over the constrained range of permutations to identify a linear subspace of the bag of pixels, wherein the linear subspace corresponds to the most likely invariant ordering of pixels in the set of data S;

performing Principle Component Analysis (PCA) to identify a set of N eigenvectors that span the linear subspace; and

encoding data designated for compression using the set of N eigenvectors to represent the designated data as a set of corresponding N coefficients in the eigenspace.

View all claims

1 Assignment

Timeline View

Assignment View

0 Petitions

Accused Products

Abstract

Methods and systems are provided for encoding, transmission and decoding of vectorized input data, for example, video or audio data. A convex invariance learning framework is established for processing input data or a given data type. Each input vector is associated with a variable transformation matrix that acts on the vector to invariantly permute the vector elements. Joint invariance and model learning is performed on a training set of invariantly transformed vectors over a constrained space of transformation matrices using maximum likelihood analysis. The maximum likelihood analysis reduces the data volume to a linear subspace volume in which the training data can be modeled by a reduced number of variables. Principal component analysis is used to identify a set of N eigenvectors that span the linear subspace. The set of N eigenvectors is used a basis set to encode input data and to decode compressed data.

49 Citations

View as Search Results

19 Claims

1. A method for compressing data, comprising:
- accessing data by a processor, the data being represented by vectors x, wherein each vector is an ordered set of pixels, and wherein each pixel comprises an n-tuple;
  
  permuting the ordering of pixels in a set of data S=(x₁, . . . x_t), over a range of permutations to obtain a bag of unordered pixels representing the data, wherein the range of allowed permutations is linearly constrained;
  
  associating a convex cost function with the permutations, wherein the cost function is derived by modeling data as a Gaussian distribution and identified as the determinant of a regularized covariance of the data;
  
  minimizing the convex cost function over the constrained range of permutations to identify a linear subspace of the bag of pixels, wherein the linear subspace corresponds to the most likely invariant ordering of pixels in the set of data S;
  
  performing Principle Component Analysis (PCA) to identify a set of N eigenvectors that span the linear subspace; and
  
  encoding data designated for compression using the set of N eigenvectors to represent the designated data as a set of corresponding N coefficients in the eigenspace.
- View Dependent Claims (2, 3, 4, 5, 6)
- - 2. The method of claim 1 wherein minimizing the convex cost function comprises minimizing the convex cost function iteratively using a variational upper bound on the cost function.
  - 3. The method of claim 2 wherein the variational upper bound on the cost function is selected such that the minimization of the cost function is one of a quadratic cost, quadratic assignment problem or a linear assignment problem.
  - 4. The method of claim 2 wherein minimizing the cost function comprises using one of quadratic programming, axis-parallel optimization, Procrustes approaches, singular value decomposition, or Kuhn-Munkres algorithms.
  - 5. The method of claim 1 wherein encoding data designated for compression comprises first invariantly transforming the designated data into the linear subspace and then representing the data as a set of N eigenvalues.
  - 6. The method of claim 5, further comprising reconstructing data in a vector representation from the set of N eigenvalues by linear combination of the eigenvectors.

7. A system for compressing data, comprising:
- a processor configured to access raw input data being represented by vectors x, wherein each vector is an ordered set of pixels, and wherein each pixel comprises an n-tuple, the raw input data including a set of sample data S=(x₁, . . . x_t), wherein each vector x_iis associated with a variable permutation operator that permutes the ordering of the pixels in the vector, wherein the data in set S is a associated with a convex cost function which estimates the cost of permuting the ordering of pixels in S to obtain a bag of unordered pixels representing the data, wherein the range of allowed permutations is linearly constrained, and wherein the cost function is statistically defined as a determinant of a covariance of the data;
  
  the processor being configured for minimizing the convex cost function over the constrained range of allowed permutations to identify a linear subspace of the bag of pixels, wherein the linear subspace corresponds to the most likely invariant ordering of pixels in the set of data S;
  
  a principal components analyzer for identifying a set of N eigenvectors that span the linear subspace; and
  
  an encoder that compresses data by using the set of N eigenvectors as a basis set to encode the data.
- View Dependent Claims (8, 9, 10, 11)
- - 8. The system of claim 7 wherein the processor comprises a convex program for minimizing the cost function over the constrained range of allowed permutations.
  - 9. The system of claim 8 wherein the convex program comprises an iterative program and a variational upper bound on the cost function.
  - 10. The system of claim 9 wherein the variational upper bound is such that the minimization of the cost function maps to one of a quadratic cost, a quadratic assignment problem and a linear assignment problem.
  - 11. The system of claim 9 wherein the iterative program comprises one of a quadratic program, a Procrustes approach program, a singular value decomposition program, an axis-parallel optimization program, or a Kuhn-Munkres algorithm.

12. A system comprising:
- a processor configured to access raw input data being represented by vectors x, wherein each vector is an ordered set of pixels, and wherein each pixel comprises an n-tuple, the raw input data including a set of sample data S=(x₁, . . . x_t), wherein each vector x_iis associated with a variable permutation operator that permutes the ordering of the pixels in the vector, wherein the data in set S is associated with a convex cost function which estimates the cost of permuting the ordering of pixels in S to obtain a bag of unordered pixels representing the data, wherein the range of allowed permutations is linearly constrained, and wherein the cost function is statistically defined as a determinant of a covariance of the data;
  
  the processor being configured for minimizing the convex cost function over the constrained range of allowed permutations to identify a linear subspace of the bag of pixels, wherein the linear subspace corresponds to the most likely invariant ordering of pixels in the set of data S;
  
  a principal components analyzer for identify a set of N of eigenvectors that span the linear subspace; and
  
  an encoder that compresses data by using the set of N eigenvectors as a basis set to encode the data; and
  
  a transmitter for transmitting data compressed by the encoder to a receiver.

13. A system comprising:
- a processor configured to access raw input data being represented by vectors x, wherein each vector is an ordered set of pixels, and wherein each pixel comprises an n-tuple, the raw input data including a set of sample data S=(x₁, . . . x_t), wherein each vector x_iis associated with a variable permutation operator that permutes the ordering of the pixels in the vector, wherein the data in set S is associated with a convex cost function which estimates the cost of permuting the ordering of pixels in S to obtain a bag of unordered pixels representing the data, wherein the range of allowed permutations is linearly constrained, and wherein the cost function is statistically defined as a determinant of a covariance of the data;
  
  the processor being configured form minimizing the convex cost function over the constrained range of allowed permutations to identify a linear subspace of the bag of pixels, wherein the linear subspace corresponds to the most likely invariant ordering of pixels in the set of data S;
  
  a principal components analyzer for identifying a set of N of eigenvectors that span the linear subspace; and
  
  an encoder that compresses data by using the set of N eigenvectors as a basis set to encode the data; and
  
  a receiver having a reconstruction unit for decoding data compressed by the encoder, wherein the reconstruction unit uses the set of N eigenvectors as a basis set to decode compressed data.

14. A method for compressing image data, comprising:
- receiving image data represented by vectors x, wherein each vector is an ordered set of pixels, and wherein each pixel comprises an n-tuple;
  
  permuting the ordering of pixels in a set of data S=(x₁, . . . x_t), over a range of permutations to obtain a bag of unordered pixels representing the data, wherein the range of allowed permutations is linearly constrained;
  
  associating a convex cost function with the permutations, wherein the cost function is derived by modeling data as a Gaussian distribution and identified as the determinant of a regularized covariance of the data;
  
  minimizing, with a processor, the convex cost function over the constrained range of permutations to identify a linear subspace of the bag of pixels, wherein the linear subspace corresponds to the most likely invariant ordering of pixels in the set of data S;
  
  performing PCA, with a principal components analyzer, to identify a set of N eigenvectors that span the linear subspace; and
  
  encoding, with an encoder, data designated for compression using the set of N eigenvectors to represent the designated data as a set of corresponding N coefficients in the eigenspace; and
  
  outputting compressed data responsive to the encoding;
  
  wherein;
  
  (a) the receiving includes receiving data representative of a physical object or system with a data receiver, and the encoding results include compressed data that can be used to reconstruct a representation of the physical object or system;
  
  or(b) the outputting compressed data includes generating an encoded image data signal and transmitting the encoded image data signal on a physical channel.
- View Dependent Claims (15, 16, 17, 18, 19)
- - 15. The method of claim 14, wherein minimizing the convex cost function comprises minimizing the convex cost function iteratively using a variational upper bound on the cost function.
  - 16. The method of claim 14 wherein encoding data designated for compression comprises first invariantly transforming the designated data into the linear subspace and then representing the data as a set of N eigenvector coefficients.
  - 17. The method of claim 16, further comprising reconstructing data in a vector representation from the set of N coefficients by linear combination of the eigenvectors.
  - 18. The method of claim 14, wherein:
    - the receiving includes receiving data representative of a physical object or system with a data receiver and the method further includes reconstructing a representation of the physical object or system.
  - 19. The method of claim 14, wherein the outputting compressed data includes generating an encoded data signal and transmitting the encoded data signal on a data transmitter.

Specification

Resources

Litigation Campaign Assessment

Current Assignee
Trustees Of Columbia University In The City Of New York (Columbia University)
Original Assignee
Trustees Of Columbia University In The City Of New York (Columbia University)
Inventors
Jebara, Tony
Primary Examiner(s)
Sparks; Donald
Assistant Examiner(s)
Wong; Lut

Application Number

US11/132,078
Publication Number

US 20050265618A1
Time in Patent Office

1,931 Days
Field of Search

706/12, 382/243
US Class Current

706/12
CPC Class Codes

G06F 18/2135   based on approximation crit...

G06T 9/00   Image coding bandwidth or r...

G06T 9/002   using neural networks

G06T 9/008   Vector quantisation

G06V 10/7715   Feature extraction, e.g. by...

G06V 40/169   Holistic features and repre...

H04N 19/90   using coding techniques not...

Ordered data compression system and methods using principle component analysis

First Claim

1 Assignment

0 Petitions

Accused Products

Abstract

49 Citations

19 Claims

Specification

Use Cases

Quick Links

Others

Ordered data compression system and methods using principle component analysis

First Claim

1 Assignment

Subscription Required

Subscription Required

0 Petitions

Subscription Required

Accused Products

Subscription Required

Abstract

49 Citations

19 Claims

Specification

Subscription Required

Use Cases

Quick Links

Others