Systems and methods for modifying a zero pad region of a windowed frame of an audio signal

US 7,987,089 B2
Filed: 02/14/2007
Issued: 07/26/2011
Est. Priority Date: 07/31/2006
Status: Active Grant

Abstract

A method for modifying a window with a frame associated with an audio signal is described. A signal is received and partitioned into a plurality of frames. A determination is made whether a frame within the plurality of frames is associated with a non-speech signal. If so, a modified discrete cosine transform (MDCT) window function is applied to the frame to generate a first zero pad region of length (M−L)/2, where L is an arbitrary value, and a second zero pad region. The frame is encoded. The decoder window is the same as the encoder window.
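
The region lengths can be checked with a little arithmetic. The sketch below lays out a 2M-sample window from the lengths given in the abstract and in claims 5, 6 and 8 of the claim list. The function name `window_regions`, the region labels, the left-to-right ordering, and the middle "flat" region of M−L samples are assumptions: the flat region is simply what remains once two zero pads and two L-sample regions are subtracted from 2M.

```python
def window_regions(M, L):
    """Return (label, start, end) spans for a 2M-sample window; end is exclusive.

    Hypothetical helper for illustration only. The ordering of the regions
    and the flat middle section are assumptions inferred from the lengths.
    """
    if not 0 <= L <= M:
        raise ValueError("L must satisfy 0 <= L <= M")
    if (M - L) % 2:
        raise ValueError("M - L must be even so that (M - L) / 2 is an integer")
    pad = (M - L) // 2
    lengths = [
        ("first zero pad", pad),    # length (M - L) / 2
        ("present overlap", L),     # overlaps the previous frame's look-ahead
        ("flat", M - L),            # assumed: remainder of the 2M samples
        ("look-ahead", L),          # overlaps the next frame's present overlap
        ("second zero pad", pad),   # length (M - L) / 2
    ]
    regions, start = [], 0
    for label, n in lengths:
        regions.append((label, start, start + n))
        start += n
    return regions


if __name__ == "__main__":
    for label, start, end in window_regions(160, 80):
        print(f"{label:16s} [{start:3d}, {end:3d})")
```

With M = 160 and L = 80, each zero pad is 40 samples long and the five regions together cover exactly 2M = 320 samples.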

23 Claims

1. A method of modifying a window with a frame associated with an audio signal, the method comprising:
- partitioning the signal into a plurality of frames;
  
  when the plurality of frames is associated with a non-speech signal, applying a modified discrete cosine transform (MDCT) window function to each of the plurality of frames to generate a plurality of windowed frames, wherein each windowed frame includes a first zero pad region that is located at a first portion of the windowed frame, wherein the first zero pad region has a length of (M−L)/2 where L is an arbitrary value that is less than or equal to M, and 2M is a number of samples in each windowed frame.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 23)
- - 2. The method of claim 1, further comprising encoding each of the plurality of windowed frames by applying an MDCT coding based scheme to each sample of each windowed frame of the plurality of windowed frames, wherein the windowed frames are consecutively adjacent.
  - 3. The method of claim 1, wherein each windowed frame comprises a length of 2M.
  - 4. The method of claim 1, wherein each windowed frame includes a second zero pad region, wherein the second zero pad region of each windowed frame is located at a second portion of the windowed frame.
  - 5. The method of claim 4, wherein the second zero pad region of each windowed frame has a second zero pad length of (M−L)/2.
  - 6. The method of claim 5, further comprising including a present overlap region of length L within each windowed frame, wherein the present overlap region of a particular windowed frame overlaps look-ahead samples associated with a previous windowed frame.
  - 7. The method of claim 6, further comprising adding a sample associated with the present overlap region of the particular windowed frame to a corresponding look-ahead sample associated with the previous windowed frame.
  - 8. The method of claim 4, wherein L is a look-ahead region that is less than M.
  - 9. The method of claim 8, wherein the look-ahead region overlaps a future overlap region associated with a future windowed frame.
  - 10. The method of claim 6, wherein the first zero pad region and the present overlap region overlap a previous windowed frame by approximately 50%.
  - 11. The method of claim 8, wherein the second zero pad region and the look-ahead region overlap a future windowed frame by approximately 50%.
  - 12. The method of claim 1, wherein a sum of squares of each sample of a first windowed frame added with an associated sample from an overlapped windowed frame equals unity.
  - 23. The method of claim 1, further comprising, for each of the plurality of windowed frames, encoding the windowed frame by applying an MDCT coding based scheme after receiving L samples in addition to the windowed frame samples and before receiving M samples in addition to the windowed frame samples.
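
Claim 12 requires that the squares of overlapped window samples sum to unity. A minimal sketch of one window that satisfies this follows. The claims do not specify the shape of the overlap regions, so the sine-shaped slopes, the flat middle section, and the names `zero_padded_window` and `power_complementary` are all assumptions made for illustration.

```python
import math


def zero_padded_window(M, L):
    """Build a 2M-sample window: zeros, rising slope, ones, falling slope, zeros.

    Assumption: the slopes are quarter-period sine curves. Any pair of slopes
    whose squares sum to one would satisfy claim 12 equally well.
    """
    if not 0 <= L <= M or (M - L) % 2:
        raise ValueError("need 0 <= L <= M with M - L even")
    pad = (M - L) // 2
    rise = [math.sin(math.pi * (n + 0.5) / (2 * L)) for n in range(L)]
    fall = rise[::-1]
    return [0.0] * pad + rise + [1.0] * (M - L) + fall + [0.0] * pad


def power_complementary(w, M, tol=1e-12):
    """Check w[n]^2 + w[n + M]^2 == 1 for every n in the first half."""
    return all(abs(w[n] ** 2 + w[n + M] ** 2 - 1.0) <= tol for n in range(M))


if __name__ == "__main__":
    w = zero_padded_window(16, 8)
    print(len(w), power_complementary(w, 16))
```

Because consecutive windows are offset by M samples, `w[n + M]` of one window lines up with `w[n]` of the next. The zero pads pair with the flat section (0 + 1), and the falling slope pairs with the next rising slope (cos² + sin²).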

13. An apparatus for modifying a window with a frame associated with an audio signal comprising:
- a processor;
  
  memory in electronic communication with the processor; and
  
  instructions stored in the memory, the instructions being executable to:
  
  partition a signal into a plurality of frames; and
  
  when the plurality of frames is associated with a non-speech signal, apply a modified discrete cosine transform (MDCT) window function to each frame of the plurality of frames to generate a plurality of windowed frames, wherein each windowed frame includes a first zero pad region that is located at a first portion of the windowed frame, wherein the first zero pad region has a length of (M−L)/2, where L is an arbitrary value that is less than or equal to M and 2M is a number of samples in each windowed frame.
- View Dependent Claims (14, 15, 16)
- - 14. The apparatus of claim 13, wherein the instructions are further executable to encode each of the plurality of windowed frames using an MDCT coding based scheme, wherein the windowed frames are consecutively adjacent.
  - 15. The apparatus of claim 13, wherein each windowed frame comprises a length of samples equal to 2M.
  - 16. The apparatus of claim 13, wherein each windowed frame includes a second zero pad region, wherein the second zero pad region is located at a second portion of the windowed frame.

17. A system that is configured to modify a window with a frame associated with an audio signal comprising:
- means for processing;
  
  means for partitioning a signal into a plurality of frames;
  
  means for applying a modified discrete cosine transform (MDCT) window function to each frame of the plurality of frames when the plurality of frames is associated with a non-speech signal to generate a plurality of windowed frames that are consecutively adjacent, wherein each windowed frame includes a first zero pad region that is located at a first portion of the windowed frame, wherein the first zero pad region has a length of (M−L)/2, where L is an arbitrary value that is less than or equal to M and 2M is a number of samples in each windowed frame; and
  
  means for encoding each of the plurality of windowed frames using an MDCT coding based scheme.

18. A computer-readable medium configured to store a set of instructions executable to:
- partition a signal into a plurality of frames;
  
  when the plurality of frames is associated with a non-speech signal, apply a modified discrete cosine transform (MDCT) window function to each frame of the plurality of frames to generate a plurality of windowed frames that are consecutively adjacent, wherein each windowed frame includes a first zero pad region that is located at a first portion of the windowed frame, wherein the first zero pad region has a length of (M−L)/2, where L is an arbitrary value that is less than or equal to M and 2M is a number of samples in each windowed frame; and
  
  encode each of the plurality of windowed frames using an MDCT coding based scheme.
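
To illustrate the windowing and MDCT encoding steps recited above, the sketch below runs consecutive 2M-sample windows (hop size M) through a textbook MDCT and inverse MDCT and overlap-adds the result. It is not taken from the patent specification: the direct O(M²) transform, the sine-shaped overlap slopes, the 2/M inverse scaling, and the names `make_window`, `mdct`, `imdct` and `round_trip` are assumptions. The same window is applied before encoding and after decoding, as the abstract states.

```python
import math


def make_window(M, L):
    """Zero-padded window of 2M samples with assumed sine-shaped slopes."""
    pad = (M - L) // 2
    rise = [math.sin(math.pi * (n + 0.5) / (2 * L)) for n in range(L)]
    return [0.0] * pad + rise + [1.0] * (M - L) + rise[::-1] + [0.0] * pad


def mdct(block):
    """Direct MDCT: 2M windowed samples in, M coefficients out."""
    M = len(block) // 2
    return [
        sum(block[n] * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
            for n in range(2 * M))
        for k in range(M)
    ]


def imdct(coeffs):
    """Direct inverse MDCT: M coefficients in, 2M aliased samples out."""
    M = len(coeffs)
    return [
        2.0 / M * sum(coeffs[k] * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
                      for k in range(M))
        for n in range(2 * M)
    ]


def round_trip(signal, M, L):
    """Window, transform, inverse transform, window again, and overlap-add."""
    w = make_window(M, L)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * M + 1, M):
        block = [signal[start + n] * w[n] for n in range(2 * M)]
        rec = imdct(mdct(block))
        for n in range(2 * M):
            out[start + n] += rec[n] * w[n]  # add to the previous frame's tail
    return out


if __name__ == "__main__":
    x = [math.sin(0.3 * n) + 0.1 * n for n in range(40)]
    y = round_trip(x, M=8, L=4)
    print(max(abs(a - b) for a, b in zip(x[8:32], y[8:32])))
```

Samples covered by two consecutive windows are recovered because the time-domain aliasing of one frame cancels against that of its neighbor. The first and last M samples of the signal are covered by only one window in this sketch and are not fully reconstructed.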

19. A method for selecting a window function to be used in calculating a modified discrete cosine transform (MDCT) of a frame, the method comprising:
- providing an algorithm to select a window function;
  
  applying the selected window function to each of a plurality of non-speech frames to produce a plurality of windowed frames, wherein the windowed frames are consecutively adjacent and each windowed frame includes a first zero pad region that is located at a first portion of the windowed frame, wherein the first zero pad region has a length of (M−L)/2, where L is an arbitrary value that is less than or equal to M and 2M is a number of samples in each windowed frame; and
  
  encoding each of the plurality of windowed frames with a modified discrete cosine transform (MDCT) coding mode based on constraints imposed on the MDCT coding mode, wherein the constraints comprise a length of the frame, a look ahead length and a delay.
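
Claim 19 names the constraints (frame length, look-ahead length, delay) without saying how they are combined. The sketch below shows one hypothetical selection rule: take the largest L that fits a look-ahead budget while keeping (M−L)/2 an integer. The function `choose_overlap`, its parameters, and the rule itself are assumptions for illustration and do not come from the claims.

```python
def choose_overlap(M, sample_rate_hz, max_lookahead_ms):
    """Pick an overlap length L for a frame of M samples (hypothetical rule).

    Returns (L, pad), where pad = (M - L) / 2 is the zero pad length.
    Assumption: the look-ahead delay is L samples, so L is capped by the
    number of samples that fit in max_lookahead_ms.
    """
    budget = int(sample_rate_hz * max_lookahead_ms / 1000)
    L = min(M, budget)
    if (M - L) % 2:
        L -= 1  # keep M - L even so the zero pad length is an integer
    if L < 0:
        raise ValueError("look-ahead budget too small for this frame length")
    return L, (M - L) // 2


if __name__ == "__main__":
    # 20 ms frames at 16 kHz with a 5 ms look-ahead budget.
    print(choose_overlap(M=320, sample_rate_hz=16000, max_lookahead_ms=5))
```

Under this rule a larger budget moves the window toward the conventional case L = M with no zero padding, while a smaller budget shortens the overlap and lengthens the zero pads.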

20. A method comprising:
- when a portion of an audio signal is classified as speech;
  
  encoding a frame of the portion of the audio signal according to a first encoding scheme when the frame is classified as voiced speech; and
  
  encoding the frame of the portion of the audio signal according to a second encoding scheme when the frame is classified as unvoiced speech, wherein the second encoding scheme differs from the first encoding scheme;
  
  when the portion of the audio signal is classified as non-speech and the portion of the audio signal includes a current frame, a previous frame, and a subsequent frame that are consecutively adjacent frames;
  
  applying a modified discrete cosine transform (MDCT) window function to each of the current frame, the previous frame, and the subsequent frame to produce a plurality of windowed frames including a windowed current frame, a windowed previous frame, and a windowed subsequent frame, wherein each windowed frame includes a first zero pad region that is located at a first portion of the windowed frame, wherein the first zero pad region has a length of (M−L)/2, where L is an arbitrary value that is less than or equal to M and 2M is a number of samples in each windowed frame.
- View Dependent Claims (21, 22)
- - 21. The method of claim 20, wherein the windowed current frame has a 50% overlap with the windowed previous frame and a 50% overlap with the windowed subsequent frame; and
    - encoding the current windowed frame according to a modified discrete cosine transform coding scheme.
  - 22. The method of claim 20, further comprising encoding the frame of the portion of the audio signal according to a third encoding scheme when the portion of the audio signal is classified as transient speech, wherein the third encoding scheme differs from the first encoding scheme and from the second encoding scheme.
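
The classification-dependent branching in claims 20 to 22 can be sketched as a lookup from frame class to coding scheme. The enum `FrameClass`, the function `select_scheme`, and the scheme labels are hypothetical names. The claims only require that the first, second and third schemes differ from one another and that non-speech frames take the zero-padded MDCT path.

```python
from enum import Enum


class FrameClass(Enum):
    """Hypothetical labels for the classifications named in claims 20 to 22."""
    VOICED = "voiced speech"
    UNVOICED = "unvoiced speech"
    TRANSIENT = "transient speech"
    NON_SPEECH = "non-speech"


# Placeholder scheme labels; the claims do not name concrete codecs.
_SCHEMES = {
    FrameClass.VOICED: "first encoding scheme",
    FrameClass.UNVOICED: "second encoding scheme",
    FrameClass.TRANSIENT: "third encoding scheme",
    FrameClass.NON_SPEECH: "MDCT with zero-padded window",
}


def select_scheme(frame_class):
    """Return the scheme label for a classified frame."""
    return _SCHEMES[frame_class]


if __name__ == "__main__":
    for frame_class in FrameClass:
        print(f"{frame_class.value:16s} -> {select_scheme(frame_class)}")
```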

Current Assignee: Qualcomm, Inc.
Original Assignee: Qualcomm, Inc.
Inventors: Krishnan, Venkatesh; Kandhadai, Ananthapadmanabhan A.
Primary Examiner(s): Smits; Talivaldis Ivars
Assistant Examiner(s): ROBERTS, SHAUN A
Application Number: US11/674,745
Publication Number: US 20080027719A1
Time in Patent Office: 1,623 Days
Field of Search: 704/214, 704/208
US Class Current: 704/214
CPC Class Codes:
G10L 19/0212 using orthogonal transforma...
G10L 19/20 using sound class specific ...
