Zero-search, zero-memory vector quantization

US 20050004795A1
Filed: 06/25/2004
Published: 01/06/2005
Est. Priority Date: 06/26/2003
Status: Active Grant

Abstract

The invention comprises a method for lossy data compression, akin to vector quantization, in which there is no explicit codebook and no search, i.e. the codebook memory and associated search computation are eliminated. Some memory and computation are still required, but these are dramatically reduced, compared to systems that do not exploit this method. For this reason, both the memory and computation requirements of the method are exponentially smaller than comparable methods that do not exploit the invention. Because there is no explicit codebook to be stored or searched, no such codebook need be generated either. This makes the method well suited to adaptive coding schemes, where the compression system adapts to the statistics of the data presented for processing: both the complexity of the algorithm executed for adaptation, and the amount of data transmitted to synchronize the sender and receiver, are exponentially smaller than comparable existing methods.
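
The "no search" property can be illustrated with a minimal sketch, assuming the single-hypercube case with positive radii. It compares a brute-force nearest-neighbour search over all 2^D hypercube vertices with a simple sign inspection. The function names and the sample values of `alpha` and `v` are illustrative assumptions, not taken from the patent.

```python
def encode_by_sign(v):
    """Build the D-bit index: bit j is 1 when component j is negative, else 0."""
    index = 0
    for j, x in enumerate(v):
        if x < 0:
            index |= 1 << j
    return index


def vertex(index, alpha):
    """Implicit code point for `index`: +alpha[j] if bit j is 0, -alpha[j] if it is 1."""
    return [-a if (index >> j) & 1 else a for j, a in enumerate(alpha)]


def encode_by_search(v, alpha):
    """Reference only: exhaustive search over all 2**D vertices of the hypercube."""
    best_index, best_dist = None, float("inf")
    for index in range(1 << len(alpha)):
        c = vertex(index, alpha)
        dist = sum((x - y) ** 2 for x, y in zip(v, c))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


# Hypothetical radius vector and input vector, chosen only for illustration.
alpha = [0.5, 1.0, 2.0]
v = [0.3, -1.7, 0.9]

print(encode_by_sign(v), encode_by_search(v, alpha))
```

Under these assumptions the search visits 2^D candidates while the sign test touches each of the D components once and stores only the D radii.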

37 Claims

1. A method for data compression comprising the steps of:
- establishing an implicit codebook comprising an implicitly defined set of vectors, hereafter called code points, which are symmetrically placed with respect to the origin;
  
  said code points implicitly represented by a hypercube radius vector $\bar{α} = \langle α_1, α_2, \ldots, α_D \rangle$, wherein said code points are used for representing information elements;
  
  said information elements constituting data to be compressed, and said information elements also being vectors; and
  
  computing a compression function for said information elements by;
  
  inspecting the signs of said information elements to determine in which orthant said information element lies, thereby determining the implicit code point of the implicit codebook to represent said information element; and
  
  determining an index of the associated implicit code point so selected for said information element.
- - 2. The method of claim 1, further comprising the steps of:
    - compressing digitized human speech;
      
      transmitting said compressed, digitized human speech through a communication channel;
      
      receiving said compressed, digitized human speech via said communication channel;
      
      decompressing said compressed, digitized human speech; and
      
      processing said digitized human speech with an automatic speech recognition system.
  - 3. The method of claim 1, further comprising the step of:
    - applying an invertible function to information elements to be compressed, said function determined so that an aggregate of typical data to be compressed will approximate a uniform distribution within a D-dimensional sphere, centered at the origin.
  - 4. The method of claim 3, further comprising the step of:
    - finding a symmetrizing transform.
  - 5. The method of claim 1, further comprising the step of:
    - finding an optimal D-hypercube codebook.
  - 6. The method of claim 3, further comprising the step of:
    - finding an optimal D-hypercube codebook, with respect to typical transformed data.
  - 7. The method of claim 6, said symmetrizing transform further comprising a whitening transform.
  - 8. The method of claim 7, further comprising the steps of:
    - projecting all data into a D-1-dimensional space;
      
      processing a resulting data set; and
      
      wherein only D-1 bits are transmitted for each vector processed.
  - 9. The method of claim 7, further comprising the step of:
    - representing the whitening transform in a memory efficient manner as the product of a diagonal matrix, said matrix represented by only its D non-zero elements, and an orthogonal matrix;
      
      the inverse of this transform therefore obtainable as the product, in the opposite order, of a diagonal matrix of D reciprocals of the original D non-zero elements, and the transpose of the original orthogonal matrix.
  - 10. The method of claim 6, further comprising the steps of:
    - determining an optimal hypercube radius vector $\bar{α} = \langle α_1, α_2, \ldots, α_D \rangle$, this vector defining said hypercube codebook, that yields minimal mean square coding distortion among all D-hypercube codebooks, for typical data to be compressed.
  - 11. The method of claim 10, further comprising the step of:
    - computing an optimal hypercube radius vector $\bar{α} = \langle α_1, α_2, \ldots, α_D \rangle$ from an example data collection $ε$ comprised of $E = |ε|$ typical data vectors $v$, by computing for each dimension $i = 1, \ldots, D$ the quantity $α_i = \frac{1}{E} \sum_{v \in ε} |v_i|$, where the sum is taken to run over every vector $v$ in $ε$.
  - 12. The method of claim 6, further comprising the step of:
    - determining an optimal hypercube radius vector $\bar{α} = \langle α_1, α_2, \ldots, α_D \rangle$, this vector defining said hypercube codebook, that yields minimal mean square coding distortion among all D-hypercube codebooks, with respect to typical transformed data.
  - 13. The method of claim 12, further comprising the step of:
    - computing an optimal hypercube radius vector $\bar{α} = \langle α_1, α_2, \ldots, α_D \rangle$ from an example data collection $ε$ comprised of $E = |ε|$ typical data vectors $v$, to which has been applied the symmetrizing function of claim 3, to yield a collection $\mathcal{U}$ comprised of $U = |\mathcal{U}| = |ε| = E$ typical data vectors $u$, by computing for each dimension $i = 1, \ldots, D$ the quantity $α_i = \frac{1}{U} \sum_{u \in \mathcal{U}} |u_i|$, where the sum is taken to run over every vector $u$ in $\mathcal{U}$.
  - 14. The method of claim 1 for compressing a vector $v = \langle v_1, v_2, \ldots, v_D \rangle$, further comprising the steps of:
    - obtaining a vector $v = \langle v_1, v_2, \ldots, v_D \rangle$ for compression;
      
      forming a D-bit binary number $i$ as a bitwise concatenation $i = m(v_D)\, m(v_{D-1}) \cdots m(v_2)\, m(v_1)$;
      
      where the $j$th bit of $i$ is 0 if $v_j$ is zero or positive, and 1 if it is negative; and
      
      transmitting $i$.
  - 15. The method of claim 4 for compressing a vector v, further comprising the steps of:
    - obtaining a vector v for compression;
      
      computing $u = Tv$, where $u$ is denoted $\langle u_1, u_2, \ldots, u_D \rangle$;
      
      where $T$ is the symmetrizing transform;
      
      forming a D-bit binary number $i$ as a bitwise concatenation $i = m(u_D)\, m(u_{D-1}) \cdots m(u_2)\, m(u_1)$;
      
      where the $j$th bit of $i$ is 0 if $u_j$ is zero or positive, and 1 if it is negative; and
      
      transmitting i.
  - 16. The method of claim 1 for decompressing an index $i$, obtained via compression with respect to the hypercube radius vector $\bar{α} = \langle α_1, α_2, \ldots, α_D \rangle$, further comprising the steps of:
    - obtaining an index $i$ for decompression;
      
      setting $\begin{matrix} \tilde{u}_1 = b_0(i, α_1) \\ \tilde{u}_2 = b_1(i, α_2) \\ \vdots \\ \tilde{u}_D = b_{D-1}(i, α_D) \end{matrix}$ where each $\tilde{u}_j$ is either $+α_j$ or $-α_j$ depending as the $j$th bit of $i$ is 0 or 1;
      
      and returning $\tilde{u}$, the vector comprised of elements $\tilde{u}_1, \tilde{u}_2, \ldots, \tilde{u}_D$ computed as above.
  - 17. The method of claim 4 for decompressing an index $i$, obtained via compression with respect to the hypercube radius vector $\bar{α} = \langle α_1, α_2, \ldots, α_D \rangle$, further comprising the steps of:
    - obtaining an index $i$ for decompression;
      
      setting $\begin{matrix} \tilde{u}_1 = b_0(i, α_1) \\ \tilde{u}_2 = b_1(i, α_2) \\ \vdots \\ \tilde{u}_D = b_{D-1}(i, α_D) \end{matrix}$ where each $\tilde{u}_j$ is either $+α_j$ or $-α_j$ depending as the $j$th bit of $i$ is 0 or 1;
      
      computing $\tilde{v} = T^{-1}\tilde{u}$; and
      
      returning $\tilde{v}$.
  - 18. The compression method of claim 3, further comprising the step of:
    - incorporating a rotation in the symmetrizing transform, or equivalently rotating the hypercube codebook, to lower distortion.
  - 19. The method of claim 18, further comprising the step of:
    - finding an optimal hypercube radius vector for the rotated hypercube codebook.
  - 20. The compression method of claim 1, further comprising the steps of:
    - increasing a number of implicit code points by increasing the number of hypercubes, wherein compression occurs with respect to a family of hypercube codebooks.
  - 21. The compression method of claim 20, using a family of hypercubes $A$, each hypercube determined by its associated hypercube radius vector $\bar{α} = \langle α_1, α_2, \ldots, α_D \rangle$, further comprising the steps of:
    - applying a symmetrizing transform $T$, obtaining $u = Tv$, said symmetrizing transform comprising the steps of:
      
      given vector $v$ to compress find $u = Tv$;
      
      finding the orthant of $u$, encoded as $i = m(u_D)\, m(u_{D-1}) \cdots m(u_1)$;
      
      finding, via explicit search within the orthant, a hypercube index $k$ of the closest hypercube $\bar{α}^k \in A$; and
      
      transmitting a result of said search, in the form of the said hypercube index $k$ so determined, along with the identity $i$ of the orthant, to a receiver.
  - 22. The method of claim 21, using a multiplicity of hypercubes A;
    - varying with respect to one another in hypercube radius vector, in orientation, or both;
      
      said orientations being expressed by a rotation $R^k$ associated to each hypercube radius vector $\bar{α}^k$, said rotation possibly being the identity; and
      
      further comprising the steps of:
      
      given vector $v$ to compress, finding each $u^k = R^k T v$;
      
      finding the orthant index $i$ of $u^k$;
      
      finding, via explicit search within the associated orthant, a hypercube index $k$ of the closest rotated hypercube; and
      
      transmitting a result of said search, in the form of the said hypercube index $k$ so determined, along with the identity $i$ of the orthant, to a receiver.
  - 23. The method of claim 21, further comprising the steps of:
    - decompressing the pair comprised of hypercube index $k$ and orthant index $i$, by using hypercube index $k$ to select an appropriate hypercube radius vector $\bar{α}^k$; and
      
      inspecting the coded orthant $i$ to yield an appropriate vertex of the $\bar{α}^k$ hypercube;
      
      wherein said vertex is taken as $\tilde{u}$, from which $\tilde{v} = T^{-1}\tilde{u}$ is computed, and the value $\tilde{v}$ returned as the result.
  - 24. The method of claim 22, further comprising the steps of:
    - decompressing the pair comprised of hypercube index k and orthant index i;
      
      by using hypercube index $k$ to select an appropriate hypercube radius vector $\bar{α}^k$; and
      
      inspecting the coded orthant $i$ to yield an appropriate vertex of the $\bar{α}^k$ hypercube;
      
      wherein said vertex is taken as $\tilde{u}$, from which $\tilde{v} = T^{-1}(R^k)^{-1}\tilde{u}$ is computed, and the value $\tilde{v}$ returned as the result.
  - 31. The method of claim 29, further comprising the steps of:
    - rotating said hypercube codebook, or equivalently rotating the data to be compressed, to lower distortion and then compressing the transformed vectors by the method of claim 24.
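
The single-hypercube pipeline of claims 9, 11, 15 and 17 can be sketched as follows, assuming a two-dimensional example and a symmetrizing transform stored as a diagonal factor times an orthogonal factor. The helper names, the rotation angle, the diagonal values and the example data are all hypothetical and serve only to make the sketch runnable.

```python
import math


def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def apply_T(diag, q, v):
    """u = diag(d) * Q * v, with only the D diagonal entries stored."""
    return [d * y for d, y in zip(diag, matvec(q, v))]


def apply_T_inv(diag, q, u):
    """v = Q^T * diag(1/d) * u: reciprocals and transpose, in the opposite order."""
    return matvec(transpose(q), [y / d for d, y in zip(diag, u)])


def radius_vector(samples):
    """alpha_i = mean of |u_i| over the example collection."""
    count = len(samples)
    return [sum(abs(s[i]) for s in samples) / count for i in range(len(samples[0]))]


def sign_index(u):
    """D-bit index: bit j is 1 when u[j] is negative."""
    return sum(1 << j for j, x in enumerate(u) if x < 0)


def vertex(index, alpha):
    """+alpha[j] when bit j is 0, -alpha[j] when bit j is 1."""
    return [-a if (index >> j) & 1 else a for j, a in enumerate(alpha)]


# Hypothetical transform: a 30 degree rotation followed by per-axis scaling.
theta = math.pi / 6
q = [[math.cos(theta), -math.sin(theta)], [math.sin(theta), math.cos(theta)]]
diag = [2.0, 0.5]

# Hypothetical example data used to estimate the radius vector.
examples = [[1.0, 0.2], [-0.8, 0.5], [0.3, -1.1], [-0.6, -0.4]]
alpha = radius_vector([apply_T(diag, q, e) for e in examples])

v = [0.9, -0.3]
i = sign_index(apply_T(diag, q, v))             # compression: D bits
v_hat = apply_T_inv(diag, q, vertex(i, alpha))  # decompression
print(i, v_hat)
```

In this sketch the receiver needs only `alpha`, `diag` and `q` to decode, which is the storage the abstract contrasts with an explicit codebook.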

25. A method to find a collection A of perfect hypercube codebooks, comprising the steps of:
- applying an orthant K-means algorithm to find a collection of K perfect hypercube codebooks that yield low average coding distortion for transformed example data.

26. A compression method, comprising the steps of:
- obtaining a vector v for compression;
  
  computing $u = Tv$, where $u$ is denoted by $\langle u_1, \ldots, u_D \rangle$;
  
  forming $\bar{ζ}(u) = \langle ζ(u_1), \ldots, ζ(u_D) \rangle$;
  
  finding $k = \arg\min_j \| u - \bar{α}^j \odot \bar{ζ}(u) \|$, where $\bar{α}^j$ is drawn from a set of hypercube radius vectors $A = \{\bar{α}^1, \bar{α}^2, \ldots, \bar{α}^K\}$;
  
  wherein $\bar{α}^j \odot \bar{ζ}(u)$ is the element of $\mathcal{K}(\bar{α}^j)$ that lies in the same orthant as $u$, and wherein $k$ is the index of a hypercube codebook that has a vertex closest to $u$;
  
  forming the D-bit binary number $i$ as the bitwise concatenation $i = m(u_D)\, m(u_{D-1}) \cdots m(u_2)\, m(u_1)$, where the $j$th bit of $i$ is 0 if $u_j$ is zero or positive, and is 1 if it is negative; and
  
  transmitting the pair $\langle k-1, i \rangle$.
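
A minimal sketch of the compression step of claim 26, assuming the input `u` has already been symmetrized (that is, `u = Tv` was computed by the caller) and assuming the sign function returns +1 for zero. The name `compress` and the sample radius vectors are illustrative, not from the patent.

```python
import math


def zeta(x):
    """Sign of x, with zero treated as positive (an assumption of this sketch)."""
    return -1.0 if x < 0 else 1.0


def orthant_index(u):
    """D-bit index i: bit j is 1 when u[j] is negative."""
    return sum(1 << j for j, x in enumerate(u) if x < 0)


def compress(u, radius_vectors):
    """Return the pair (k - 1, i) for an already-transformed vector u.

    Only K candidate vertices are compared, one per hypercube, because the
    orthant of u fixes which vertex of each hypercube can be the closest.
    """
    signs = [zeta(x) for x in u]

    def distance(alpha):
        return math.sqrt(sum((x - a * s) ** 2 for x, a, s in zip(u, alpha, signs)))

    k_minus_1 = min(range(len(radius_vectors)), key=lambda j: distance(radius_vectors[j]))
    return k_minus_1, orthant_index(u)


# Hypothetical family of K = 3 radius vectors in D = 2 dimensions.
A = [[1.0, 1.0], [3.0, 3.0], [1.0, 4.0]]
print(compress([-2.8, 3.5], A))
```

The zero-based list position plays the role of the transmitted value k − 1.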

27. A decompression method, comprising the steps of:
- obtaining the pair $\langle k-1, i \rangle$ for decompression;
  
  selecting $\bar{α}^k = \langle α_1^k, \ldots, α_D^k \rangle$ from the set of hypercube radius vectors $A = \{\bar{α}^1, \bar{α}^2, \ldots, \bar{α}^K\}$;
  
  setting $\begin{matrix} \tilde{u}_1 = b_0(i, α_1^k) \\ \tilde{u}_2 = b_1(i, α_2^k) \\ \vdots \\ \tilde{u}_D = b_{D-1}(i, α_D^k), \end{matrix}$ where $\tilde{u}_j$ is either $+α_j^k$ or $-α_j^k$, depending on whether the $j$th bit of $i$ is 0 or 1;
  
  computing $\tilde{v} = T^{-1}\tilde{u}$; and
  
  returning $\tilde{v}$.
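
The decompression step of claim 27 might look like the sketch below, assuming the inverse transform is supplied as a dense matrix. The name `decompress`, the family `A` and the matrix `T_INV` are hypothetical values chosen for illustration.

```python
def decompress(pair, radius_vectors, t_inverse):
    """Rebuild an approximation of v from the pair (k - 1, i)."""
    k_minus_1, i = pair
    alpha = radius_vectors[k_minus_1]
    # Bit j of i selects the sign of component j: 0 gives +alpha_j, 1 gives -alpha_j.
    u_tilde = [-a if (i >> j) & 1 else a for j, a in enumerate(alpha)]
    # Undo the symmetrizing transform: v_tilde = T^-1 * u_tilde.
    return [sum(m * x for m, x in zip(row, u_tilde)) for row in t_inverse]


# Hypothetical radius vectors and inverse transform (a simple axis scaling).
A = [[1.0, 1.0], [3.0, 3.0], [1.0, 4.0]]
T_INV = [[0.5, 0.0], [0.0, 2.0]]
print(decompress((1, 1), A, T_INV))
```

No table lookup beyond selecting one of the K radius vectors is involved; the vertex itself is generated from the bits of `i`.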

28. A method for finding a family A of K hypercube codebooks, comprising the steps of:
- beginning with a fixed number $K$ of desired hypercubes, and an example dataset $\mathcal{U}$;
  
  mapping each element $u \in \mathcal{U}$, where $u = \langle u_1, \ldots, u_D \rangle$, to the positive orthant of $\mathbb{R}^D$, via the map $p: \langle u_1, \ldots, u_D \rangle \mapsto \langle |u_1|, |u_2|, \ldots, |u_D| \rangle$, yielding the set $\mathcal{U}^+ = \{p(u) \mid u \in \mathcal{U}\}$;
  
  selecting an initial set of $K$ radius vectors $A^{(0)} = \{\bar{α}_0^{(0)}, \ldots, \bar{α}_{K-1}^{(0)}\}$;
  
  setting an iteration count $i$ to 0;
  
  establishing a termination condition $τ$ which depends upon one or more of:
  
  the number of iterations executed;
  
  the closeness of match between a current radius vector collection $A^{(i)}$ and $\mathcal{U}^+$; and
  
  the improvement of a statistic over a previous iteration, wherein said dependence is expressed as $τ(i, A^{(i)}, \mathcal{U}^+)$;
  
  testing $τ(i, A^{(i)}, \mathcal{U}^+)$; and
  
  if the termination condition is satisfied, returning $A^{(i)}$ as the desired radius vector collection; and
  
  stopping;
  
  else, if the termination condition is not satisfied, computing a new radius vector collection $A^{(i+1)}$ as follows:
  
  partitioning $\mathcal{U}^+$ into $K$ sets $S_0, \ldots, S_{K-1}$, where $S_j = \{ v \in \mathcal{U}^+ \mid \arg\min_k \| v - \bar{α}_k^{(i)} \| = j \}$, where $S_j$ comprises all the vectors $v$ in $\mathcal{U}^+$ that are closer to $\bar{α}_j^{(i)}$ than any other element of $A^{(i)}$;
  
  setting $\bar{α}_j^{(i+1)}$, the $j$th entry of the new radius vector collection $A^{(i+1)}$, to the mean of the vectors in $S_j$, which is in symbols $\bar{α}_j^{(i+1)} = \frac{1}{|S_j|} \sum_{v \in S_j} v$; and setting $A^{(i+1)} = \{\bar{α}_0^{(i+1)}, \ldots, \bar{α}_{K-1}^{(i+1)}\}$;
  
  incrementing the iteration count $i$; and
  
  returning to said testing step.
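
The iteration of claim 28 can be sketched as a K-means loop on the data folded into the positive orthant. This is a minimal sketch under stated assumptions: the termination condition is taken to be an iteration cap or negligible movement of the radius vectors, and an empty cluster keeps its previous radius vector. Neither choice is specified by the claim, and the function name and sample data are hypothetical.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def orthant_kmeans(dataset, initial_radii, max_iterations=50, tolerance=1e-12):
    """Return K radius vectors fitted to the absolute values of the dataset."""
    # Map p: fold every vector into the positive orthant.
    folded = [[abs(x) for x in u] for u in dataset]
    radii = [list(a) for a in initial_radii]

    for _ in range(max_iterations):
        # Partition the folded data by nearest current radius vector.
        clusters = [[] for _ in radii]
        for v in folded:
            j = min(range(len(radii)), key=lambda k: sq_dist(v, radii[k]))
            clusters[j].append(v)

        # Replace each radius vector by the mean of its cluster.
        new_radii = []
        for j, members in enumerate(clusters):
            if members:
                new_radii.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_radii.append(radii[j])  # assumption: keep an empty cluster's vector

        moved = max(sq_dist(a, b) for a, b in zip(radii, new_radii))
        radii = new_radii
        if moved <= tolerance:  # assumed termination test
            break
    return radii


# Hypothetical data: two scales of magnitude spread over several orthants.
data = [[1.0, 1.0], [-1.2, 0.8], [0.9, -1.1], [-5.0, 5.0], [5.2, -4.8], [-4.9, -5.1]]
print(orthant_kmeans(data, [[0.5, 0.5], [4.0, 4.0]]))
```

Because signs are discarded before clustering, each fitted radius vector describes a full hypercube of 2^D vertices rather than a single code point.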

29. A method for data compression, comprising the steps of:
- computing a compression function for information elements by;
  
  inspecting the signs of said information elements to determine in which quadrant of an implicit codebook a corresponding implicit code point lies; and
  
  determining an index of an associated implicit code point for said information element.
- - 30. The method of claim 29, further comprising the step of:
    - compressing vectors arising from a data source by first applying a symmetrizing transform, and then compressing the transformed vectors.
  - 32. The method of claim 30, further comprising the step of incorporating a rotation in the symmetrizing transform, or equivalently rotating the hypercube codebook, to lower distortion, and then compressing.
  - 33. The method of claim 29, further comprising the step of:
    - establishing an implicit codebook comprising an implicitly defined set of vectors, hereafter called code points, which are symmetrically placed with respect to the origin;
      
      wherein said code points are used for representing information elements.
  - 34. The method of claim 29, further comprising the step of:
    - increasing a number of code points by increasing the number of hypercubes, wherein compression occurs with respect to a family of hypercube codebooks, and the preferred hypercube is found by explicit search, once the orthant of the vector to be compressed has been determined.
  - 35. The method of claim 30, further comprising the step of:
    - increasing a number of code points by increasing the number of hypercubes, wherein compression occurs with respect to a family of hypercube codebooks, and the preferred hypercube is found by explicit search, once the orthant of the vector to be compressed has been determined.
  - 36. The method of claim 29, further comprising the step of:
    - increasing a number of code points by increasing the number of hypercubes, wherein compression occurs with respect to a family A of hypercube codebooks, the hypercubes varying with respect to one another in hypercube radius vector, in orientation, or both;
      
      the selected hypercube and orthant index being found by explicit search among a set consisting of the preferred vertex of each hypercube, the preferred vertex of each hypercube being the one that lies in the same orthant of the vector to be compressed, for each hypercube.
  - 37. The method of claim 30, further comprising the step of:
    - increasing a number of code points by increasing the number of hypercubes, wherein compression occurs with respect to a family A of hypercube codebooks, the hypercubes varying with respect to one another in hypercube radius vector, in orientation, or both;
      
      the selected hypercube and orthant index being found by explicit search among a set consisting of the preferred vertex of each hypercube, the preferred vertex of each hypercube being the one that lies in the same orthant of the vector to be compressed, for each hypercube.
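
For hypercubes that differ in orientation as well as radius (claims 22, 24 and 36), one possible sketch is given below. It assumes two dimensions, assumes the symmetrizing transform has already been applied by the caller, and represents each family member as a hypothetical `(rotation, alpha)` pair. All names and values are illustrative assumptions.

```python
import math


def rotation(theta):
    """2-D rotation matrix for angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matvec(m, x):
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def encode(u, family):
    """Return (k, i): the best hypercube and the orthant of u in its rotated frame."""
    best = None
    for k, (rot, alpha) in enumerate(family):
        uk = matvec(rot, u)  # u^k = R^k u
        i = sum(1 << j for j, x in enumerate(uk) if x < 0)
        # Distance to the preferred vertex, the one sharing the orthant of u^k.
        dist = sum((abs(x) - a) ** 2 for x, a in zip(uk, alpha))
        if best is None or dist < best[0]:
            best = (dist, k, i)
    return best[1], best[2]


def decode(k, i, family):
    """Pick the vertex named by (k, i) and undo the rotation."""
    rot, alpha = family[k]
    u_tilde = [-a if (i >> j) & 1 else a for j, a in enumerate(alpha)]
    # A rotation is orthogonal, so its inverse is its transpose.
    return matvec(transpose(rot), u_tilde)


# Hypothetical family: the same unit square, unrotated and rotated by 45 degrees.
family = [(rotation(0.0), [1.0, 1.0]), (rotation(math.pi / 4), [1.0, 1.0])]
k, i = encode([0.0, -1.4], family)
print(k, i, decode(k, i, family))
```

The explicit search here still runs over only K preferred vertices, one per hypercube, rather than over every code point in the family.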

Current Assignee: Promptu Systems Corporation
Original Assignee: Promptu Systems Corporation
Inventors: Printz, Harry

Granted Patent: US 7,729,910 B2

US Class Current: 704/222
CPC Class Codes

G06Q 30/02   Marketing; Price estimation...

G10L 15/26   Speech to text systems G10L...

G10L 17/00   Speaker identification or v...

G10L 19/00   Speech or audio signals ana...

G10L 25/24   the extracted parameters be...

H03M 7/3082   Vector coding for televisio...

H04N 19/94   Vector quantisation
