Coding of sparse digital media spectral data

US 7,774,205 B2
Filed: 06/15/2007
Issued: 08/10/2010
Est. Priority Date: 06/15/2007
Status: Active Grant

Abstract

An audio encoder/decoder provides efficient compression of spectral transform coefficient data characterized by sparse spectral peaks. The audio encoder/decoder applies a temporal prediction of the frequency position of spectral peaks. The spectral peaks in the transform coefficients that are predicted from those in a preceding transform coding block are encoded as a shift in frequency position from the previous transform coding block and two non-zero coefficient levels. The prediction may avoid coding very large zero-level transform coefficient runs as compared to conventional run length coding. For spectral peaks not predicted from those in a preceding transform coding block, the spectral peaks are encoded as a value trio of a length of a run of zero-level spectral transform coefficients, and two non-zero coefficient levels.
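
The value trio described in the abstract can be illustrated with a minimal sketch. It assumes that each peak occupies two adjacent coefficient positions, that trailing zeros are implied by the block length, and that no entropy coding is applied. The names `encode_trios` and `decode_trios` are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

# (zero_run, level0, level1): run of zeros, then two coefficient levels.
Trio = Tuple[int, int, int]


def encode_trios(coeffs: List[int]) -> List[Trio]:
    """Scan quantized coefficients and emit (zero_run, level0, level1) trios.

    Assumption: level1 is whatever follows level0, and may itself be zero.
    """
    trios: List[Trio] = []
    i, n = 0, len(coeffs)
    while i < n:
        run = 0
        while i < n and coeffs[i] == 0:
            run += 1
            i += 1
        if i == n:
            break  # trailing zeros are implied by the block length
        level0 = coeffs[i]
        level1 = coeffs[i + 1] if i + 1 < n else 0
        trios.append((run, level0, level1))
        i += 2
    return trios


def decode_trios(trios: List[Trio], block_len: int) -> List[int]:
    """Rebuild the coefficient block from the trios."""
    out = [0] * block_len
    pos = 0
    for run, level0, level1 in trios:
        pos += run
        out[pos] = level0
        if pos + 1 < block_len:
            out[pos + 1] = level1
        pos += 2
    return out


block = [0, 0, 0, 5, -3, 0, 0, 0, 0, 2, 0, 0]
trios = encode_trios(block)
restored = decode_trios(trios, len(block))
```

In this sketch a block of twelve coefficients collapses to two trios, which is the kind of saving the abstract attributes to sparse spectra. A real codec would follow this with the joint or separate entropy coding that the claims mention.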

17 Claims

1. A method of compressively encoding audio signal data containing a time series of audio signal samples as a compressed data stream, the method comprising:
- transforming successive blocks of the audio signal data into sets of spectral coefficients;
- quantizing the spectral coefficients;
- for at least a portion of the spectral coefficients in the sets, detecting any spectral peaks out of the spectral coefficients in the portion;
- correlating spectral peaks detected out of the set of spectral coefficients for a current block to spectral peaks detected out of the spectral coefficients for a preceding block of the audio signal data; and
- encoding information to represent those of the spectral peaks for the current block that correlate to spectral peaks for the preceding block in the compressed data stream using temporal prediction coding and encoding information to represent at least some of the spectral peaks in the compressed data stream using at least one three value combination of a length of a run of zero-valued spectral coefficients and levels of two spectral coefficients following the run.
- Dependent Claims (2, 3, 4, 5)
  - 2. The method of claim 1 wherein said encoding using a three value combination comprises encoding the information using a joint or separate entropy code that represents the three value combination.
  - 3. The method of claim 1 wherein said encoding using temporal prediction coding comprises using a code that represents a shift in position of a current block spectral peak from that of a preceding block spectral peak to which the current block spectral peak correlates.
  - 4. The method of claim 1 wherein said encoding using temporal prediction coding comprises using a code that represents a combination of a shift in position of a current block spectral peak from that of a preceding block spectral peak to which the current block spectral peak correlates, and two peak coefficient levels.
  - 5. A method of decoding the compressed data stream encoded according to the method of claim 4, the method of decoding comprising:
    - reading information representing spectral peaks from the compressed data stream;
    - for the spectral peak information encoded using at least one three value combination, decoding the three value combination code to determine spectral coefficients for the spectral peak from the values of zero-run length and levels;
    - for the spectral peak information encoded using temporal prediction coding, decoding the combination code to determine spectral coefficients for the spectral peak from the value of the shift and the peak coefficient levels;
    - de-quantizing the spectral coefficients; and
    - inverse transforming the spectral coefficients to reconstruct the time series of audio signal samples.
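
The temporal prediction coding of claims 3 to 5 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function names are hypothetical, the peak is assumed to span two adjacent coefficients, and de-quantization is assumed to be uniform with a caller-supplied step size, which the claims do not specify.

```python
from typing import List, Tuple

# (shift, level0, level1): position shift plus two peak coefficient levels.
PredictedPeak = Tuple[int, int, int]


def encode_predicted_peak(prev_pos: int, cur_pos: int,
                          level0: int, level1: int) -> PredictedPeak:
    """Code a peak relative to the previous-block peak it correlates to."""
    return (cur_pos - prev_pos, level0, level1)


def decode_predicted_peak(prev_pos: int, code: PredictedPeak,
                          block: List[int]) -> int:
    """Write the two peak levels into block; return the recovered position."""
    shift, level0, level1 = code
    pos = prev_pos + shift
    block[pos] = level0
    if pos + 1 < len(block):
        block[pos + 1] = level1
    return pos


def dequantize(block: List[int], step: float) -> List[float]:
    """Uniform scalar de-quantization (the step size is an assumption)."""
    return [q * step for q in block]


code = encode_predicted_peak(prev_pos=100, cur_pos=102, level0=7, level1=-4)
block = [0] * 256
pos = decode_predicted_peak(100, code, block)
```

A small shift replaces what would otherwise be a long zero run before the peak, which is the saving the abstract describes. The inverse transform step is omitted here because the claims do not name a specific transform.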

6. An audio data processor, comprising:
- an input for receiving an audio data stream containing a time series of audio signal samples;
- a time-frequency transform for transforming successive blocks of the audio signal samples to produce sets of spectral coefficients;
- a spectral peak encoder operating to detect spectral peaks in at least a portion of the spectral coefficient sets, and operating to encode individual ones of the detected spectral peaks using one of a temporal prediction coding and a zero run coding, wherein the spectral peak encoder operates to correlate the detected spectral peaks in the portion of successive spectral coefficient sets to those in the portion of their preceding spectral coefficient sets, and to encode the detected spectral peaks that correlate to spectral peaks in preceding spectral coefficient sets using the temporal prediction coding and otherwise to encode the detected spectral peaks using the zero run coding.
- Dependent Claims (7, 8, 9, 10, 16)
  - 7. The audio data processor of claim 6 wherein the temporal prediction coding encodes a detected spectral peak as a position shift from a correlated spectral peak in the preceding spectral coefficient set.
  - 8. The audio data processor of claim 6, wherein the zero run coding encodes a detected spectral peak as at least one multi-value combination comprising a length of a run of zero-valued spectral coefficients preceding the detected spectral peak, and levels of a pair of spectral coefficients following the run.
  - 9. The audio data processor of claim 8, wherein the zero run coding further comprises a joint entropy encoding of the at least one multi-value combination.
  - 10. The audio data processor of claim 8, wherein the temporal prediction coding further operates to encode a code indicating absence among the detected spectral peaks of a spectral peak correlating to a spectral peak in a preceding spectral coefficient set.
  - 16. The audio data processor of claim 6, further comprising a decoder configured to read information representing spectral peaks from the compressed data stream, and for the spectral peak information encoded using at least one three value combination, decoding the three value combination code to determine spectral coefficients for the spectral peak from the values of zero-run length and levels, and for the spectral peak information encoded using temporal prediction coding, decoding the combination code to determine spectral coefficients for the spectral peak from the value of the shift and the peak coefficient levels, de-quantizing the spectral coefficients, and inverse transforming the spectral coefficients to reconstruct the time series of audio signal samples.
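
The mode selection performed by the spectral peak encoder of claim 6 might look like the sketch below. The claims do not define how peaks are detected or correlated, so both rules here are assumptions: a peak is taken to start wherever a non-zero coefficient follows a zero, and two peaks are taken to correlate when their positions differ by at most a hypothetical `max_shift`.

```python
from typing import List, Optional, Tuple


def detect_peaks(coeffs: List[int]) -> List[int]:
    """Return the start index of each run of non-zero coefficients.

    Assumed peak definition; the claims leave detection unspecified.
    """
    return [i for i, c in enumerate(coeffs)
            if c != 0 and (i == 0 or coeffs[i - 1] == 0)]


def choose_modes(prev_peaks: List[int], cur_peaks: List[int],
                 max_shift: int = 2) -> List[Tuple[int, str, Optional[int]]]:
    """Label each current peak as 'temporal' or 'zero_run'.

    A 'temporal' entry carries the matched previous-block position;
    a 'zero_run' entry carries None.
    """
    unused = set(prev_peaks)
    modes: List[Tuple[int, str, Optional[int]]] = []
    for pos in cur_peaks:
        candidates = [p for p in unused if abs(p - pos) <= max_shift]
        if candidates:
            # Nearest previous peak wins; ties go to the lower position.
            match = min(candidates, key=lambda p: (abs(p - pos), p))
            unused.discard(match)
            modes.append((pos, "temporal", match))
        else:
            modes.append((pos, "zero_run", None))
    return modes


prev_peaks = detect_peaks([0, 0, 4, 3, 0, 0, 0, 0, 6, 0, 0, 0])
cur_peaks = detect_peaks([0, 0, 0, 5, 2, 0, 0, 0, 0, 0, 0, 7])
modes = choose_modes(prev_peaks, cur_peaks)
```

Here the peak that moved from position 2 to position 3 is routed to temporal prediction coding, while the new peak at position 11 has no neighbour within `max_shift` and falls back to zero run coding.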

11. A computer-readable data storage device having instructions carried thereon, the instructions being executable by an audio data processor to perform a method of compressing an audio data stream, the method comprising:
- transforming successive blocks of a time sample audio data stream into sets of spectral coefficients;
- quantizing the spectral coefficients;
- encoding the spectral coefficients into a compressed audio data stream, wherein said encoding for at least a portion of the spectral coefficients of a set comprises:
  - identifying spectral peaks among the spectral coefficients of the portion;
  - correlating the identified spectral peaks of the set to spectral peaks of a preceding set;
  - encoding those of the identified spectral peaks of the set that correlate to spectral peaks of the preceding set using a temporal prediction coding; and
  - encoding those of the identified spectral peaks of the set that lack correlation to spectral peaks of the preceding set using a zero run length coding.
- Dependent Claims (12, 13, 14, 15)
  - 12. The computer-readable data storage device of claim 11 wherein encoding using the temporal prediction coding comprises:
    - encoding one of the identified spectral peaks that correlates to a spectral peak of the preceding set using a coded value representing a shift in position from the correlated spectral peak of the preceding set.
  - 13. The computer-readable data storage device of claim 12 wherein encoding using the temporal prediction coding further comprises:
    - in a case that no identified spectral peak correlates to a spectral peak of the preceding set, encoding a value indicative of a died out spectral peak for a location of the spectral peak of the preceding set.
  - 14. The computer-readable data storage device of claim 11 wherein encoding using the zero run length coding comprises:
    - encoding one of the identified spectral peaks that lacks correlation to the spectral peaks of the preceding set using a coded value combination of a run length of zero-level spectral coefficients and levels of two spectral coefficients.
  - 15. The computer-readable data storage device of claim 14 wherein encoding using the zero run length coding comprises:
    - encoding said one of the identified spectral peaks as a joint or separate entropy code representing the coded value combination.
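
The "died out" signalling of claim 13 can be sketched by emitting one symbol per previous-block peak: either a position shift or a marker that the peak has no successor. The marker value, the matching rule, and the function names below are all assumptions made for illustration; the claims only require that some value indicate a died out peak.

```python
from typing import List, Optional

DIED_OUT = None  # hypothetical marker; a real codec would reserve a code word


def code_previous_peaks(prev_peaks: List[int], cur_peaks: List[int],
                        max_shift: int = 2) -> List[Optional[int]]:
    """For each previous peak emit a shift to its match, or DIED_OUT."""
    remaining = sorted(cur_peaks)
    symbols: List[Optional[int]] = []
    for p in prev_peaks:
        near = [c for c in remaining if abs(c - p) <= max_shift]
        if near:
            # Nearest current peak wins; ties go to the lower position.
            c = min(near, key=lambda x: (abs(x - p), x))
            remaining.remove(c)
            symbols.append(c - p)
        else:
            symbols.append(DIED_OUT)
    return symbols


def decode_positions(prev_peaks: List[int],
                     symbols: List[Optional[int]]) -> List[int]:
    """Recover the positions of the peaks that survived into this block."""
    return [p + s for p, s in zip(prev_peaks, symbols) if s is not DIED_OUT]


symbols = code_previous_peaks([10, 40, 90], [11, 92, 200])
positions = decode_positions([10, 40, 90], symbols)
```

In this example the previous peak at position 40 has died out, and the current peak at position 200 is not covered by any symbol, so under claim 14 it would be coded with the zero run length coding instead.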

17. A method of decoding, comprising:
- receiving a compressed audio data stream produced by the method including:
  - transforming successive blocks of the audio signal data into sets of spectral coefficients;
  - quantizing the spectral coefficients;
  - for at least a portion of the spectral coefficients in the sets, detecting any spectral peaks out of the spectral coefficients in the portion;
  - correlating spectral peaks detected out of the set of spectral coefficients for a current block to spectral peaks detected out of the spectral coefficients for a preceding block of the audio signal data; and
  - encoding information to represent those of the spectral peaks for the current block that correlate to spectral peaks for the preceding block in the compressed data stream using temporal prediction coding and encoding information to represent at least some of the spectral peaks in the compressed data stream using at least one three value combination of a length of a run of zero-valued spectral coefficients and levels of two spectral coefficients following the run;
- reading information representing spectral peaks from the compressed data stream;
- for the spectral peak information encoded using at least one three value combination, decoding the three value combination code to determine spectral coefficients for the spectral peak from the values of zero-run length and levels;
- for the spectral peak information encoded using temporal prediction coding, decoding the combination code to determine spectral coefficients for the spectral peak from the value of the shift and the peak coefficient levels;
- de-quantizing the spectral coefficients; and
- inverse transforming the spectral coefficients to reconstruct the time series of audio signal samples.

Current Assignee
Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee
Microsoft Corporation
Inventors
Chen, Wei-Ge; Koishida, Kazuhito; Mehrotra, Sanjeev
Primary Examiner(s)
AZAD, ABUL K

Application Number
US11/764,108
Publication Number
US 20080312758A1
Time in Patent Office
1,152 Days
Field of Search
704/222, 704/230, 704/503
US Class Current
704/503
CPC Class Codes
G10L 19/02   using spectral analysis, e....
G10L 19/0212   using orthogonal transforma...
G10L 19/032   Quantisation or dequantisat...
G10L 19/18   Vocoders using multiple modes
