EFFICIENT CODING OF DIGITAL MEDIA SPECTRAL DATA USING WIDE-SENSE PERCEPTUAL SIMILARITY

US 20090083046A1
Filed: 11/26/2008
Published: 03/26/2009
Est. Priority Date: 01/23/2004
Status: Active Grant

Abstract

Traditional audio encoders may conserve coding bit-rate by encoding fewer than all spectral coefficients, which can produce a blurry low-pass sound in the reconstruction. An audio encoder using wide-sense perceptual similarity improves the quality by encoding a perceptually similar version of the omitted spectral coefficients, represented as a scaled version of already coded spectrum. The omitted spectral coefficients are divided into a number of sub-bands. The sub-bands are encoded as two parameters: a scale factor, which may represent the energy in the band; and a shape parameter, which may represent a shape of the band. The shape parameter may be in the form of a motion vector pointing to a portion of the already coded spectrum, an index to a spectral shape in a fixed code-book, or a random noise vector. The encoding thus efficiently represents a scaled version of a similarly shaped portion of spectrum to be copied at decoding.

20 Claims

1. An audio encoding method, comprising:
- transforming an input audio signal block into a set of spectral coefficients;
  
  dividing the spectral coefficients into plural bands;
  
  coding values of the spectral coefficients of at least one of the bands in an output bit stream; and
  
  for at least one of the other bands, coding the at least one other band in the output bit-stream as a scaled version of a shape of a portion of the at least one of the bands coded as spectral coefficient values, wherein the coding the at least one other band comprises coding the other band using a scale parameter and a shape parameter, wherein the shape parameter comprises a motion vector and indicates the portion of the at least one of the bands coded as spectral coefficient values, and wherein the scale parameter is a scaling factor to scale the portion.
- - 2. The audio encoding method of claim 1, wherein the scaling factor represents a total energy for the at least one of the other bands.
  - 3. The audio encoding method of claim 1, wherein the scaling factor is coded as coefficients characterizing a polynomial relation that yields scaling factors of two or more of the other bands as a function of frequency.
  - 4. The audio encoding method of claim 1, wherein the scaling factor is a root-mean-square value of coefficients within the other band.
  - 5. The audio encoding method of claim 1, wherein the shape parameter further comprises values representing shift of the portion.
  - 6. The audio encoding method of claim 1, wherein the shape parameter further comprises values representing stretch of the portion.
  - 7. The audio encoding method of claim 1, wherein the motion vector indicates a normalized version of the portion.
  - 8. The audio encoding method of claim 1, wherein the coding the other band comprises coding the other band as a filter having a frequency response and excitation.
  - 9. The audio encoding method of claim 8, wherein the filter having the frequency response is a linear predictive coding filter.
  - 10. The audio encoding method of claim 1, wherein the shape parameter further comprises a vector for a spectral shape from a codebook.
  - 11. The audio encoding method of claim 1, further comprising:
    - selecting the portion of the at least one of the bands coded as spectral coefficient values by performing a least-means-square comparison of a normalized version of the at least one of the other bands; and
      
      storing an indication of the selected portion in the motion vector.
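The selection step of claims 4 and 11 can be illustrated with a minimal sketch. It assumes the scale parameter is the band's root-mean-square value and that the motion vector is simply the start offset of the best-matching baseband portion; the function names (`rms`, `normalize`, `encode_band`), the exhaustive search, and the sample data are hypothetical and are not taken from the patent.

```python
import math

def rms(values):
    """Root-mean-square of a sequence of coefficients."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def normalize(values):
    """Scale a band to unit RMS so that only its shape remains."""
    r = rms(values)
    return [v / r for v in values] if r > 0 else list(values)

def encode_band(baseband, band):
    """Return (scale, motion_vector) for one extended band.

    Assumption: scale is the band's RMS value, and motion_vector is the
    start offset of the baseband portion whose normalized shape has the
    smallest mean squared difference from the normalized band.
    """
    target = normalize(band)
    n = len(band)
    best_offset, best_err = 0, float("inf")
    # Exhaustive search over every baseband portion of the same width.
    for offset in range(len(baseband) - n + 1):
        candidate = normalize(baseband[offset:offset + n])
        err = sum((c - t) ** 2 for c, t in zip(candidate, target)) / n
        if err < best_err:
            best_offset, best_err = offset, err
    return rms(band), best_offset

# Hypothetical data: the band has the shape of baseband[3:6], scaled by 0.2.
baseband = [4.0, -2.0, 1.0, 3.0, -1.5, 0.5, 0.2, -0.1]
band = [0.6, -0.3, 0.1]
scale, motion_vector = encode_band(baseband, band)
```

Only `scale` and `motion_vector` would need to be written to the bit stream for this band, rather than its individual coefficients. The sketch requires the baseband to be at least as wide as the band being coded.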

12. One or more computer-readable storage media comprising instructions configurable to cause a computer to perform an audio decoding method for an encoded audio bitstream, the method comprising:
- decoding one or more baseband spectral coefficients from the encoded audio bitstream;

  decoding one or more extended band spectral coefficients by:
  - copying one or more identified baseband spectral coefficients according to a shape parameter, wherein the shape parameter comprises a motion vector identifying one or more baseband spectral coefficients to be copied; and

    scaling the copied one or more identified baseband spectral coefficients according to a scale parameter.
- - 13. The one or more computer-readable storage media of claim 12, wherein the shape parameter further comprises a vector for a spectral shape in a codebook, and wherein the decoding one or more extended band spectral coefficients further comprises copying the spectral shape from the codebook.
  - 14. The one or more computer-readable storage media of claim 12, wherein the scale parameter comprises a scaling factor representing a total energy of a band of spectral coefficients from which the encoded audio bitstream was encoded.
  - 15. The one or more computer-readable storage media of claim 12, wherein the scale parameter comprises a scaling factor, the scaling factor being a root-mean-square value of spectral coefficients from which the encoded audio bitstream was encoded.
  - 16. The one or more computer-readable storage media of claim 12, the method further comprising performing an inverse transform operation to transform the decoded one or more baseband spectral coefficients and the decoded one or more extended band spectral coefficients into a reproduction of an input audio signal block.
  - 17. The one or more computer-readable storage media of claim 12, wherein the scale parameter comprises coefficients characterizing a polynomial relation that yields scaling factors for a plurality of extended band spectral coefficients as a function of frequency.
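Claims 3 and 17 describe coding the scale factors of several bands as the coefficients of a polynomial in frequency. The following is a rough sketch of that idea under stated assumptions: it uses a first-degree polynomial fitted by ordinary least squares, treats each band's center bin as its frequency, and uses made-up values. The claims fix none of these choices, and the names `fit_line` and `eval_poly` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = c0 + c1 * x; returns (c0, c1).

    Requires at least two distinct x values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c1 = sxy / sxx
    return mean_y - c1 * mean_x, c1

def eval_poly(coeffs, x):
    """Evaluate c0 + c1*x + c2*x^2 + ... using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

# Hypothetical band center bins and per-band scale factors.
band_centers = [10.0, 14.0, 18.0, 22.0]
band_scales = [0.80, 0.60, 0.40, 0.20]

# Encoder side: two coefficients stand in for four scale factors.
coeffs = fit_line(band_centers, band_scales)

# Decoder side: recover a scale factor for each band from the polynomial.
decoded_scales = [eval_poly(coeffs, x) for x in band_centers]
```

In this example the scale factors happen to lie on a straight line, so the decoded values match the originals up to rounding; in general the polynomial only approximates them.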

18. A computing device comprising:
- a processing unit;

  one or more computer-readable storage media comprising instructions configured to cause the processing unit to perform an audio decoding method for an encoded audio bitstream, the method comprising:
  - decoding one or more baseband spectral coefficients from the encoded audio bitstream;

    decoding a first band of extended spectral coefficients from the encoded audio bitstream by:
    - decoding, from the encoded audio bitstream, a scale factor for the first band;

      copying one or more identified baseband spectral coefficients according to a first shape parameter, wherein the shape parameter comprises a motion vector identifying one or more baseband spectral coefficients to be copied, the identified one or more baseband spectral coefficients describing a shape of a spectral band; and

      scaling the copied one or more identified baseband spectral coefficients according to the decoded scale factor for the first band;

    decoding a second band of extended spectral coefficients from the encoded audio bitstream by:
    - decoding, from the encoded audio bitstream, a scale factor for the second band;

      copying one or more vectors from a codebook according to a second shape parameter; and

      scaling the copied one or more vectors from the codebook according to the decoded scale factor for the second band; and

    performing an inverse transform on the decoded one or more baseband spectral coefficients and the decoded one or more extended band spectral coefficients to make a reconstructed audio signal.
- - 19. The computing device of claim 18, wherein the decoded scale factor for the first band comprises a root-mean-square value of spectral coefficients from which the encoded audio bitstream was encoded.
  - 20. The computing device of claim 18, wherein the first shape parameter further comprises values representing a stretch of the shape of the spectral band.
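The two decoding paths of claim 18 (one band copied from the baseband via a motion vector, another copied from a codebook) might look roughly like the sketch below. It assumes that "scaling according to the decoded scale factor" means rescaling the copied shape so that its RMS equals the scale factor, which is one reading consistent with claim 19. The parameter layout, the codebook contents, and the function names are hypothetical, and the inverse transform step is omitted.

```python
import math

def rms(values):
    """Root-mean-square of a sequence of coefficients."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def scale_to(shape, scale):
    """Rescale a copied shape so that its RMS equals the decoded scale."""
    r = rms(shape)
    return [v * scale / r for v in shape] if r > 0 else [0.0] * len(shape)

def decode_extended_band(params, baseband, codebook, width):
    """Rebuild one extended band from its scale and shape parameters."""
    kind, index = params["shape"]
    if kind == "motion_vector":
        # Copy the baseband portion the motion vector points at.
        shape = baseband[index:index + width]
    elif kind == "codebook":
        # Copy a fixed spectral shape from the codebook.
        shape = codebook[index]
    else:
        raise ValueError("unknown shape parameter kind: " + kind)
    return scale_to(shape, params["scale"])

# Hypothetical decoded inputs.
baseband = [4.0, -2.0, 1.0, 3.0, -1.5, 0.5]
codebook = [[1.0, 1.0, 1.0], [1.0, 0.0, -1.0]]
first_band = {"scale": 0.5, "shape": ("motion_vector", 3)}
second_band = {"scale": 0.25, "shape": ("codebook", 1)}

extended = (decode_extended_band(first_band, baseband, codebook, 3)
            + decode_extended_band(second_band, baseband, codebook, 3))

# Full spectrum that would be passed to the inverse transform.
spectrum = baseband + extended
```

Keeping the shape lookup separate from the scaling lets both shape sources share one scaling routine, so further shape sources could be added as extra branches.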

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Chen, Wei-Ge; Mehrotra, Sanjeev

Granted Patent: US 8,645,127 B2
US Class Current: 704/500
CPC Class Codes:
- G10L 19/0204: using subband decomposition
- G10L 19/0208: Subband vocoders
- G10L 19/035: Scalar quantisation
