Audio encoding method and apparatus

US 10,347,267 B2
Filed: 08/21/2017
Issued: 07/09/2019
Est. Priority Date: 06/24/2014
Status: Active Grant

Abstract

An audio encoding method and an apparatus are provided. The method includes: determining sparseness of distribution, on spectrums, of energy of N input audio frames (101), where the N audio frames include a current audio frame, and N is a positive integer; and determining, according to the sparseness of distribution, on the spectrums, of the energy of the N audio frames, whether to use a first encoding method or a second encoding method to encode the current audio frame (102), where the first encoding method is an encoding method that is based on time-frequency transform and transform coefficient quantization and that is not based on linear prediction, and the second encoding method is a linear-prediction-based encoding method. The method can reduce encoding complexity and ensure that encoding is of relatively high accuracy.
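
For illustration only, steps 101 and 102 can be sketched in Python using the minimum-bandwidth measure recited in claims 1 and 2. The function names, the 0.9 proportion, and the threshold of 4 are assumptions chosen for the example and are not taken from the patent. Treating an all-zero frame as bandwidth 0 and assigning the equality case to linear prediction are also assumptions, since the claims address neither case.

```python
from typing import Sequence


def min_bandwidth(energies: Sequence[float], proportion: float) -> int:
    """Count how many of the largest coefficients are needed before their
    accumulated energy exceeds `proportion` of the frame's total energy."""
    total = sum(energies)
    if total <= 0:
        return 0  # assumption: silent frames are not covered by the claims
    accumulated = 0.0
    # Accumulate energy in descending order, as in claim 2.
    for count, energy in enumerate(sorted(energies, reverse=True), start=1):
        accumulated += energy
        if accumulated / total > proportion:
            return count
    return len(energies)


def average_min_bandwidth(frames: Sequence[Sequence[float]], proportion: float) -> float:
    """Average the per-frame minimum bandwidths over the N frames."""
    return sum(min_bandwidth(frame, proportion) for frame in frames) / len(frames)


def choose_encoder(frames: Sequence[Sequence[float]],
                   proportion: float,
                   first_preset_value: float) -> str:
    """Step 102: a small bandwidth (sparse spectrum) selects transform coding."""
    bandwidth = average_min_bandwidth(frames, proportion)
    if bandwidth < first_preset_value:
        return "transform"
    # Assumption: equality is grouped with the "greater than" branch.
    return "linear_prediction"


# Hypothetical energy spectra: one tonal (sparse) frame and one flat frame.
sparse_frame = [100.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
flat_frame = [1.0] * 8
print(choose_encoder([sparse_frame], 0.9, 4))
print(choose_encoder([flat_frame], 0.9, 4))
```

In this sketch the tonal frame concentrates its energy in one coefficient, so its minimum bandwidth is small and the transform-based method is selected; the flat frame needs every coefficient and falls to the linear-prediction branch.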

20 Claims

1. An audio encoding method, comprising:
- dividing an energy spectrum of each of N audio frames into P fast Fourier transform (FFT) energy spectrum coefficients, wherein P and N are positive integers, and the N audio frames comprise a current audio frame;
  
  determining a general sparseness parameter according to energy of the P FFT energy spectrum coefficients of each of the N audio frames by determining an average value of minimum bandwidths of distribution on spectrums of a first preset proportion of energy of the N audio frames according to the energy of the P FFT energy spectrum coefficients of each of the N audio frames, wherein the general sparseness parameter comprises a first minimum bandwidth, wherein the average value of the minimum bandwidths of the distribution on spectrums of the first preset proportion of the energy of the N audio frames is used as the first minimum bandwidth, and wherein the general sparseness parameter indicates sparseness of distribution in energy spectrums of the N audio frames; and
  
  determining, based on the sparseness of distribution, whether to use a first encoding method or a second encoding method to encode the current audio frame, wherein the first encoding method is based on time-frequency transform and transform coefficient quantization, and the second encoding method is a linear-prediction-based encoding method, and wherein the first encoding method is determined to be used to encode the current audio frame based on a condition that the first minimum bandwidth is less than a first preset value, or the second encoding method is determined to be used to encode the current audio frame based on a condition that the first minimum bandwidth is greater than the first preset value.
- - 2. The method according to claim 1, wherein the determining the average value of minimum bandwidths of the first preset proportion of the energy of the N audio frames comprises:
    - sorting the energy of the P FFT energy spectrum coefficients of each audio frame in descending order;
      
      comparing energy obtained after each time of accumulation with the total energy of the audio frame, and if a proportion is greater than the first preset proportion, ending the accumulation process, where a quantity of times of accumulation is the minimum bandwidth; and
      
      determining the average value of minimum bandwidths according to the minimum bandwidth of distribution, on the spectrum, of the energy that accounts for not less than the first preset proportion of each of the N audio frames.
  - 3. The method according to claim 1, wherein the general sparseness parameter comprises a first energy proportion, and wherein the determining the general sparseness parameter comprises:
    - selecting P₁ FFT energy spectrum coefficients from the P FFT energy spectrum coefficients of each of the N audio frames; and
      
      determining the first energy proportion according to energy of the P₁ FFT energy spectrum coefficients of each of the N audio frames and total energy of the N audio frames, wherein P₁ is a positive integer less than P, wherein the first encoding method is determined to be used to encode the current audio frame based on a condition that the first energy proportion is greater than a second preset value, or the second encoding method is determined to be used to encode the current audio frame based on a condition that the first energy proportion is less than the second preset value.
  - 4. The method according to claim 3, wherein energy of any one of the P₁ FFT energy spectrum coefficients is greater than energy of any one of FFT energy spectrum coefficients in the P FFT energy spectrum coefficients other than the P₁ FFT energy spectrum coefficients.
  - 5. The method according to claim 1, wherein the general sparseness parameter comprises a second minimum bandwidth and a third minimum bandwidth, and wherein the determining the general sparseness parameter comprises:
    - determining an average value of minimum bandwidths of distribution, on the spectrums, of a second preset proportion of the energy of the N audio frames according to the energy of the P FFT energy spectrum coefficients of each of the N audio frames; and
      
      determining an average value of minimum bandwidths of distribution, on the spectrums, of a third preset proportion of the energy of the N audio frames according to the energy of the P FFT energy spectrum coefficients of each of the N audio frames, wherein the average value of the minimum bandwidths of the second preset proportion of the energy of the N audio frames is used as the second minimum bandwidth, wherein the average value of the minimum bandwidths of the third preset proportion of the energy of the N audio frames is used as the third minimum bandwidth, wherein the second preset proportion is less than the third preset proportion, wherein the first encoding method is determined to be used to encode the current audio frame based on a condition that the second minimum bandwidth is less than a third preset value and the third minimum bandwidth is less than a fourth preset value, or the first encoding method is determined to be used to encode the current audio frame based on a condition that the third minimum bandwidth is less than a fifth preset value, or the second encoding method is determined to be used to encode the current audio frame based on a condition that the third minimum bandwidth is greater than a sixth preset value, and wherein the fourth preset value is greater than or equal to the third preset value, the fifth preset value is less than the fourth preset value, and the sixth preset value is greater than the fourth preset value.
  - 6. The method according to claim 5, wherein the determining the average value of minimum bandwidths of the second preset proportion of the energy of the N audio frames and the determining the average value of minimum bandwidths of the third preset proportion of the energy of the N audio frames comprises:
    - sorting the energy of the P FFT energy spectrum coefficients of each audio frame in descending order;
      
      determining, according to the sorted energy of the P FFT energy spectrum coefficients of each audio frame, a minimum bandwidth of distribution, on the spectrum, of energy that accounts for not less than the second preset proportion of each of the N audio frames;
      
      determining, according to the minimum bandwidth of distribution, on the spectrum, of the energy that accounts for not less than the second preset proportion of each of the N audio frames, an average value of minimum bandwidths of distribution, on the spectrums, of energy that accounts for not less than the second preset proportion of the N audio frames;
      
      determining, according to the energy, sorted in descending order, of the P FFT energy spectrum coefficients of each of the N audio frames, a minimum bandwidth of distribution, on the spectrum, of energy that accounts for not less than the third preset proportion of each of the N audio frames; and
      
      determining, according to the minimum bandwidth of distribution, on the spectrum, of the energy that accounts for not less than the third preset proportion of each of the N audio frames, an average value of minimum bandwidths of distribution, on the spectrums, of energy that accounts for not less than the third preset proportion of the N audio frames.
  - 7. The method according to claim 1, wherein the general sparseness parameter comprises a second energy proportion and a third energy proportion, and wherein the determining the general sparseness parameter comprises:
    - determining the second energy proportion according to energy of P₂ FFT energy spectrum coefficients of each of the N audio frames and total energy of the N audio frames;
      
      determining the third energy proportion according to energy of P₃ FFT energy spectrum coefficients of each of the N audio frames and the total energy of the N audio frames, wherein P₂ and P₃ are positive integers less than P, and P₂ is less than P₃, and wherein the first encoding method is determined to be used to encode the current audio frame based on a condition that the second energy proportion is greater than a seventh preset value and the third energy proportion is greater than an eighth preset value, or the first encoding method is determined to be used to encode the current audio frame based on a condition that the second energy proportion is greater than a ninth preset value, or the second encoding method is determined to be used to encode the current audio frame based on a condition that the third energy proportion is less than a tenth preset value.
  - 8. The method according to claim 7, wherein the P₂ FFT energy spectrum coefficients have maximum energy among possible selections of P₂ FFT energy spectrum coefficients from the P FFT energy spectrum coefficients, and wherein the P₃ FFT energy spectrum coefficients have maximum energy among possible selections of P₃ FFT energy spectrum coefficients from the P FFT energy spectrum coefficients.
  - 9. The method according to claim 1, wherein the N is 1.
  - 10. The method according to claim 1, wherein the first encoding method is not based on linear prediction.
  - 20. The audio encoder according to claim 10, wherein the first encoding method is not based on linear prediction.
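
The energy-proportion tests of claims 3, 4, 7 and 8 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the claims do not say how per-frame proportions are combined across the N frames, so the sketch averages them; the order in which the three conditions of claim 7 are checked is also an assumption; and every function name and numeric value is hypothetical.

```python
from typing import Optional, Sequence


def top_energy_proportion(energies: Sequence[float], count: int) -> float:
    """Share of a frame's total energy held by its `count` largest
    coefficients (the selection described in claims 4 and 8)."""
    total = sum(energies)
    if total <= 0:
        return 0.0  # assumption: silent frames are not covered by the claims
    largest = sorted(energies, reverse=True)[:count]
    return sum(largest) / total


def average_energy_proportion(frames: Sequence[Sequence[float]], count: int) -> float:
    """Assumption: combine the N frames by averaging per-frame proportions."""
    return sum(top_energy_proportion(f, count) for f in frames) / len(frames)


def choose_by_proportion(frames: Sequence[Sequence[float]],
                         p1: int,
                         second_preset_value: float) -> str:
    """Claim 3: a large proportion (sparse spectrum) selects transform coding.
    Assumption: equality is grouped with the linear-prediction branch."""
    proportion = average_energy_proportion(frames, p1)
    return "transform" if proportion > second_preset_value else "linear_prediction"


def choose_by_two_proportions(frames: Sequence[Sequence[float]],
                              p2: int, p3: int,
                              seventh: float, eighth: float,
                              ninth: float, tenth: float) -> Optional[str]:
    """Claim 7: three alternative conditions on two proportions (p2 < p3).
    Returns None when no recited condition is met."""
    if not p2 < p3:
        raise ValueError("claim 7 requires P2 to be less than P3")
    r2 = average_energy_proportion(frames, p2)
    r3 = average_energy_proportion(frames, p3)
    if r2 > seventh and r3 > eighth:
        return "transform"
    if r2 > ninth:
        return "transform"
    if r3 < tenth:
        return "linear_prediction"
    return None


# Hypothetical energy spectra and thresholds.
sparse_frame = [100.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
flat_frame = [1.0] * 8
print(choose_by_proportion([sparse_frame], 1, 0.5))
print(choose_by_two_proportions([flat_frame], 1, 2, 0.6, 0.8, 0.9, 0.5))
```

Returning `None` when none of the recited conditions holds leaves the fallback decision to the caller, because claim 7 does not state what happens in that case.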

11. An audio encoder, comprising:
- a memory comprising instructions; and
  
  one or more processors in communication with the memory, wherein the one or more processors execute the instructions to:
  
  divide an energy spectrum of each of N audio frames into P fast Fourier transform (FFT) energy spectrum coefficients, wherein P and N are positive integers, and the N audio frames comprise a current audio frame;
  
  determine a general sparseness parameter according to energy of the P FFT energy spectrum coefficients of each of the N audio frames by determining an average value of minimum bandwidths of distribution on the spectrums of a first preset proportion of energy of the N audio frames according to the energy of the P FFT energy spectrum coefficients of each of the N audio frames, wherein the general sparseness parameter comprises a first minimum bandwidth, wherein the average value of the minimum bandwidths of the distribution on the spectrums of the first preset proportion of the energy of the N audio frames is used as the first minimum bandwidth, and wherein the general sparseness parameter indicates sparseness of distribution in energy spectrums of the N audio frames; and
  
  determine, based on the sparseness of distribution, whether to use a first encoding method or a second encoding method to encode the current audio frame, wherein the first encoding method is based on time-frequency transform and transform coefficient quantization, and the second encoding method is a linear-prediction-based encoding method, and wherein the first encoding method is determined to be used to encode the current audio frame based on a condition that the first minimum bandwidth is less than a first preset value, or the second encoding method is determined to be used to encode the current audio frame based on a condition that the first minimum bandwidth is greater than the first preset value.
- - 12. The audio encoder according to claim 11, wherein, to determine the average value of minimum bandwidths, the one or more processors execute the instructions to:
    - sort the energy of the P FFT energy spectrum coefficients of each audio frame in descending order;
      
      compare energy obtained after each time of accumulation with the total energy of the audio frame, and if a proportion is greater than the first preset proportion, end the accumulation process, where a quantity of times of accumulation is the minimum bandwidth; and
      
      determine the average value of minimum bandwidths according to the minimum bandwidth of distribution, on the spectrum, of the energy that accounts for not less than the first preset proportion of each of the N audio frames.
  - 13. The audio encoder according to claim 11, wherein the general sparseness parameter comprises a first energy proportion, and wherein, to determine the general sparseness parameter, the one or more processors execute the instructions to:
    - select P₁ FFT energy spectrum coefficients from the P FFT energy spectrum coefficients of each of the N audio frames, and determine the first energy proportion according to energy of the P₁ FFT energy spectrum coefficients of each of the N audio frames and total energy of the N audio frames, wherein P₁ is a positive integer less than P, and wherein the first encoding method is determined to be used to encode the current audio frame based on a condition that the first energy proportion is greater than a second preset value, or the second encoding method is determined to be used to encode the current audio frame based on a condition that the first energy proportion is less than the second preset value.
  - 14. The audio encoder according to claim 13, wherein energy of any one of the P₁ FFT energy spectrum coefficients is greater than energy of any one of FFT energy spectrum coefficients in the P FFT energy spectrum coefficients other than the P₁ FFT energy spectrum coefficients.
  - 15. The audio encoder according to claim 11, wherein the general sparseness parameter comprises a second minimum bandwidth and a third minimum bandwidth, and wherein, to determine the general sparseness parameter, the one or more processors execute the instructions to:
    - determine an average value of minimum bandwidths of distribution, on the spectrums, of a second preset proportion of the energy of the N audio frames according to the energy of the P FFT energy spectrum coefficients of each of the N audio frames and determine an average value of minimum bandwidths of distribution, on the spectrums, of a third preset proportion of the energy of the N audio frames according to the energy of the P FFT energy spectrum coefficients of each of the N audio frames, wherein the average value of the minimum bandwidths of the second preset proportion of the energy of the N audio frames is used as the second minimum bandwidth, the average value of the minimum bandwidths of the third preset proportion of the energy of the N audio frames is used as the third minimum bandwidth, and the second preset proportion is less than the third preset proportion, wherein the first encoding method is determined to be used to encode the current audio frame based on a condition that the second minimum bandwidth is less than a third preset value and the third minimum bandwidth is less than a fourth preset value, or the first encoding method is determined to be used to encode the current audio frame based on a condition that the third minimum bandwidth is less than a fifth preset value, or the second encoding method is determined to be used to encode the current audio frame based on a condition that the third minimum bandwidth is greater than a sixth preset value, and wherein the fourth preset value is greater than or equal to the third preset value, the fifth preset value is less than the fourth preset value, and the sixth preset value is greater than the fourth preset value.
  - 16. The audio encoder according to claim 15, wherein, to determine the average value of minimum bandwidths, the one or more processors execute the instructions to:
    - sort the energy of the P FFT energy spectrum coefficients of each audio frame in descending order;
      
      determine, according to the sorted energy of the P FFT energy spectrum coefficients of each audio frame, a minimum bandwidth of distribution, on the spectrum, of energy that accounts for not less than the second preset proportion of each of the N audio frames;
      
      determine, according to the minimum bandwidth of distribution, on the spectrum, of the energy that accounts for not less than the second preset proportion of each of the N audio frames, an average value of minimum bandwidths of distribution, on the spectrums, of energy that accounts for not less than the second preset proportion of the N audio frames;
      
      determine, according to the energy, sorted in descending order, of the P FFT energy spectrum coefficients of each of the N audio frames, a minimum bandwidth of distribution, on the spectrum, of energy that accounts for not less than the third preset proportion of each of the N audio frames; and
      
      determine, according to the minimum bandwidth of distribution, on the spectrum, of the energy that accounts for not less than the third preset proportion of each of the N audio frames, an average value of minimum bandwidths of distribution, on the spectrums, of energy that accounts for not less than the third preset proportion of the N audio frames.
  - 17. The audio encoder according to claim 11, wherein the general sparseness parameter comprises a second energy proportion and a third energy proportion, and wherein, to determine the general sparseness parameter, the one or more processors execute the instructions to:
    - determine the second energy proportion according to energy of P₂ FFT energy spectrum coefficients of each of the N audio frames and total energy of the respective N audio frames;
      
      determine the third energy proportion according to energy of P₃ FFT energy spectrum coefficients of each of the N audio frames and the total energy of the N audio frames, wherein P₂ and P₃ are positive integers less than P, and P₂ is less than P₃; and
      
      wherein the first encoding method is determined to be used to encode the current audio frame based on a condition that the second energy proportion is greater than a seventh preset value and the third energy proportion is greater than an eighth preset value, or the first encoding method is determined to be used to encode the current audio frame based on a condition that the second energy proportion is greater than a ninth preset value, or the second encoding method is determined to be used to encode the current audio frame based on a condition that the third energy proportion is less than a tenth preset value.
  - 18. The audio encoder according to claim 17, wherein the P₂ FFT energy spectrum coefficients have maximum energy among possible selections of P₂ FFT energy spectrum coefficients from the P FFT energy spectrum coefficients, and wherein the P₃ FFT energy spectrum coefficients have maximum energy among possible selections of P₃ FFT energy spectrum coefficients from the P FFT energy spectrum coefficients.
  - 19. The audio encoder according to claim 11, wherein the N is 1.
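
The two-bandwidth decision of claims 5 and 15 combines three alternative conditions with ordering constraints on the preset values. The sketch below is one possible reading, given as an assumption-laden illustration: the evaluation order of the conditions, the handling of silent frames, the `None` result when no condition holds, and all names and numbers are hypothetical and do not come from the patent.

```python
from typing import Optional, Sequence


def min_bandwidth(energies: Sequence[float], proportion: float) -> int:
    """Number of largest coefficients needed to exceed `proportion` of the
    frame's total energy (descending accumulation, as in claims 6 and 16)."""
    total = sum(energies)
    if total <= 0:
        return 0  # assumption: silent frames are not covered by the claims
    accumulated = 0.0
    for count, energy in enumerate(sorted(energies, reverse=True), start=1):
        accumulated += energy
        if accumulated / total > proportion:
            return count
    return len(energies)


def average_min_bandwidth(frames: Sequence[Sequence[float]], proportion: float) -> float:
    """Average of the per-frame minimum bandwidths over the N frames."""
    return sum(min_bandwidth(f, proportion) for f in frames) / len(frames)


def choose_by_two_bandwidths(frames: Sequence[Sequence[float]],
                             second_proportion: float, third_proportion: float,
                             third_value: float, fourth_value: float,
                             fifth_value: float, sixth_value: float) -> Optional[str]:
    """Apply the three conditions of claims 5 and 15; None if none is met."""
    # Ordering constraints recited in the claims.
    if not second_proportion < third_proportion:
        raise ValueError("second preset proportion must be less than the third")
    if not (fourth_value >= third_value
            and fifth_value < fourth_value
            and sixth_value > fourth_value):
        raise ValueError("preset values violate the recited ordering")
    second_bw = average_min_bandwidth(frames, second_proportion)
    third_bw = average_min_bandwidth(frames, third_proportion)
    if second_bw < third_value and third_bw < fourth_value:
        return "transform"
    if third_bw < fifth_value:
        return "transform"
    if third_bw > sixth_value:
        return "linear_prediction"
    return None


# Hypothetical energy spectra, proportions, and preset values.
sparse_frame = [100.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
flat_frame = [1.0] * 8
print(choose_by_two_bandwidths([sparse_frame], 0.85, 0.95, 2, 4, 2, 6))
print(choose_by_two_bandwidths([flat_frame], 0.85, 0.95, 2, 4, 2, 6))
```

Validating the ordering of the preset values up front keeps a misconfigured set of thresholds from silently producing a decision that the claims would not cover.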

Current Assignee: Top Quality Telephony LLC
Original Assignee: Huawei Technologies Co., Ltd. (Huawei Investment & Holding Co., Ltd.)
Inventors: Wang, Zhe
Primary Examiner(s): He, Jialong
Application Number: US15/682,097
Publication Number: US 20170345436A1
Time in Patent Office: 687 Days
CPC Class Codes

G10L 19/02   using spectral analysis, e....

G10L 19/0204   using subband decomposition

G10L 19/035   Scalar quantisation

G10L 19/04   using predictive techniques

G10L 19/06   Determination or coding of ...

G10L 19/20   using sound class specific ...

G10L 19/22   Mode decision, i.e. based o...

G10L 25/03   characterised by the type o...
