Systems and methods for modifying a zero pad region of a windowed frame of an audio signal
Abstract
A method for modifying a window with a frame associated with an audio signal is described. A signal is received. The signal is partitioned into a plurality of frames. A determination is made whether a frame within the plurality of frames is associated with a non-speech signal. If the frame is determined to be associated with a non-speech signal, a modified discrete cosine transform (MDCT) window function is applied to the frame to generate a first zero pad region, which has a length of (M−L)/2 where L is an arbitrary value, and a second zero pad region. The frame is encoded. The decoder window is the same as the encoder window.
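The region lengths in the abstract can be illustrated with a small sketch. It assumes one possible window layout that is not taken from the text above: a zero pad of (M−L)/2 samples, a sine-shaped rise of L samples, a flat region of M−L samples, a mirrored fall, and a second zero pad of (M−L)/2 samples, for 2M samples in total. The function name `zero_padded_mdct_window`, the sine slopes, the requirement that M−L be even, and the values M = 160, L = 80 are all illustrative assumptions.

```python
import numpy as np

def zero_padded_mdct_window(M: int, L: int) -> np.ndarray:
    """Build a 2M-sample window with (M - L) / 2 zeros at each end.

    Assumed layout (hypothetical, not from the claims): zero pad,
    sine rise of length L, flat region of length M - L, mirrored
    fall of length L, zero pad.
    """
    if not 0 < L <= M:
        raise ValueError("L must satisfy 0 < L <= M")
    if (M - L) % 2:
        raise ValueError("M - L must be even so the pad length is an integer")
    pad = (M - L) // 2
    k = np.arange(L)
    rise = np.sin(np.pi * (k + 0.5) / (2 * L))  # rises from near 0 to near 1
    return np.concatenate(
        [np.zeros(pad), rise, np.ones(M - L), rise[::-1], np.zeros(pad)]
    )

M, L = 160, 80  # illustrative values only
w = zero_padded_mdct_window(M, L)
print(len(w), (M - L) // 2)  # 320 40

# Power-complementarity check between the two halves of the window,
# the usual condition for alias cancellation in overlapped MDCT frames.
print(np.allclose(w[:M] ** 2 + w[M:] ** 2, 1.0))  # True
```

With this layout, choosing L = M removes both zero pads and leaves a plain sine window, while smaller L values lengthen the pads and shorten the sloped regions.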
23 Claims
1. A method of modifying a window with a frame associated with an audio signal, the method comprising:
  partitioning the signal into a plurality of frames;
  when the plurality of frames is associated with a non-speech signal, applying a modified discrete cosine transform (MDCT) window function to each of the plurality of frames to generate a plurality of windowed frames, wherein each windowed frame includes a first zero pad region that is located at a first portion of the windowed frame, wherein the first zero pad region has a length of (M−L)/2 where L is an arbitrary value that is less than or equal to M, and 2M is a number of samples in each windowed frame.
  (Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 23)

13. An apparatus for modifying a window with a frame associated with an audio signal comprising:
  a processor;
  memory in electronic communication with the processor; and
  instructions stored in the memory, the instructions being executable to:
    partition a signal into a plurality of frames; and
    when the plurality of frames is associated with a non-speech signal, apply a modified discrete cosine transform (MDCT) window function to each frame of the plurality of frames to generate a plurality of windowed frames, wherein each windowed frame includes a first zero pad region that is located at a first portion of the windowed frame, wherein the first zero pad region has a length of (M−L)/2, where L is an arbitrary value that is less than or equal to M and 2M is a number of samples in each windowed frame.
  (Dependent claims: 14, 15, 16)

17. A system that is configured to modify a window with a frame associated with an audio signal comprising:
  means for processing;
  means for partitioning a signal into a plurality of frames;
  means for applying a modified discrete cosine transform (MDCT) window function to each frame of the plurality of frames when the plurality of frames is associated with a non-speech signal to generate a plurality of windowed frames that are consecutively adjacent, wherein each windowed frame includes a first zero pad region that is located at a first portion of the windowed frame, wherein the first zero pad region has a length of (M−L)/2, where L is an arbitrary value that is less than or equal to M and 2M is a number of samples in each windowed frame; and
  means for encoding each of the plurality of windowed frames using an MDCT coding based scheme.

18. A computer-readable medium configured to store a set of instructions executable to:
  partition a signal into a plurality of frames;
  when the plurality of frames is associated with a non-speech signal, apply a modified discrete cosine transform (MDCT) window function to each frame of the plurality of frames to generate a plurality of windowed frames that are consecutively adjacent, wherein each windowed frame includes a first zero pad region that is located at a first portion of the windowed frame, wherein the first zero pad region has a length of (M−L)/2, where L is an arbitrary value that is less than or equal to M and 2M is a number of samples in each windowed frame; and
  encode each of the plurality of windowed frames using an MDCT coding based scheme.

19. A method for selecting a window function to be used in calculating a modified discrete cosine transform (MDCT) of a frame, the method comprising:
  providing an algorithm to select a window function;
  applying the selected window function to each of a plurality of non-speech frames to produce a plurality of windowed frames, wherein the windowed frames are consecutively adjacent and each windowed frame includes a first zero pad region that is located at a first portion of the windowed frame, wherein the first zero pad region has a length of (M−L)/2, where L is an arbitrary value that is less than or equal to M and 2M is a number of samples in each windowed frame; and
  encoding each of the plurality of windowed frames with a modified discrete cosine transform (MDCT) coding mode based on constraints imposed on the MDCT coding mode, wherein the constraints comprise a length of the frame, a look ahead length and a delay.

20. A method comprising:
  when a portion of an audio signal is classified as speech:
    encoding a frame of the portion of the audio signal according to a first encoding scheme when the frame is classified as voiced speech; and
    encoding the frame of the portion of the audio signal according to a second encoding scheme when the frame is classified as unvoiced speech, wherein the second encoding scheme differs from the first encoding scheme;
  when the portion of the audio signal is classified as non-speech and the portion of the audio signal includes a current frame, a previous frame, and a subsequent frame that are consecutively adjacent frames:
    applying a modified discrete cosine transform (MDCT) window function to each of the current frame, the previous frame, and the subsequent frame to produce a plurality of windowed frames including a windowed current frame, a windowed previous frame, and a windowed subsequent frame, wherein each windowed frame includes a first zero pad region that is located at a first portion of the windowed frame, wherein the first zero pad region has a length of (M−L)/2, where L is an arbitrary value that is less than or equal to M and 2M is a number of samples in each windowed frame.
  (Dependent claims: 21, 22)
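
The branching recited in claim 20 can be pictured as a two-level decision: speech versus non-speech for the portion, then voiced versus unvoiced for a speech frame. The sketch below only shows that control flow. The enum `CodingPath`, the function `select_coding_path`, and the boolean inputs are hypothetical names, and the claim does not say how the classification itself is performed.

```python
from enum import Enum, auto

class CodingPath(Enum):
    """Hypothetical labels for the three branches described in claim 20."""
    FIRST_SCHEME = auto()    # speech portion, frame classified as voiced
    SECOND_SCHEME = auto()   # speech portion, frame classified as unvoiced
    MDCT_WINDOWING = auto()  # non-speech portion

def select_coding_path(portion_is_speech: bool, frame_is_voiced: bool) -> CodingPath:
    """Return the branch taken for one frame.

    frame_is_voiced is only consulted for speech portions; for
    non-speech portions the MDCT window function is applied.
    """
    if portion_is_speech:
        if frame_is_voiced:
            return CodingPath.FIRST_SCHEME
        return CodingPath.SECOND_SCHEME
    return CodingPath.MDCT_WINDOWING

print(select_coding_path(True, True).name)    # FIRST_SCHEME
print(select_coding_path(True, False).name)   # SECOND_SCHEME
print(select_coding_path(False, False).name)  # MDCT_WINDOWING
```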
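
Claims 17 and 18 recite encoding each windowed frame with an MDCT-based scheme, and the abstract notes that the decoder window is the same as the encoder window. The following sketch shows, under stated assumptions, why that pairing can reconstruct the signal: windowed frames of 2M samples are taken every M samples, each is transformed to M coefficients, and the decoder applies the same window before overlap-adding. The frame hop of M, the textbook MDCT definition with a 2/M inverse scaling, the sine-sloped window layout, and all function names are assumptions for illustration. Quantization and bitstream coding are omitted.

```python
import numpy as np

def window(M, L):
    # Hypothetical layout: (M-L)/2 zeros, sine rise, ones, mirrored fall, zeros.
    pad = (M - L) // 2
    rise = np.sin(np.pi * (np.arange(L) + 0.5) / (2 * L))
    return np.concatenate(
        [np.zeros(pad), rise, np.ones(M - L), rise[::-1], np.zeros(pad)]
    )

def mdct_basis(M):
    # Rows are the M cosine basis functions over 2M input samples.
    n = np.arange(2 * M)[None, :]
    k = np.arange(M)[:, None]
    return np.cos(np.pi / M * (n + 0.5 + M / 2) * (k + 0.5))

def encode(signal, w):
    """Window each 2M-sample frame (hop M) and return its M MDCT coefficients."""
    M = len(w) // 2
    C = mdct_basis(M)
    starts = range(0, len(signal) - 2 * M + 1, M)
    return [C @ (w * signal[s:s + 2 * M]) for s in starts]

def decode(coeffs, w, length):
    """Inverse-transform each frame, apply the same window, and overlap-add."""
    M = len(w) // 2
    C = mdct_basis(M)
    out = np.zeros(length)
    for i, X in enumerate(coeffs):
        out[i * M:i * M + 2 * M] += w * (2.0 / M) * (C.T @ X)
    return out

M, L = 16, 8  # illustrative values; (M - L) / 2 = 4 zero samples per pad
w = window(M, L)
x = np.random.default_rng(0).standard_normal(8 * M)
y = decode(encode(x, w), w, len(x))

# Samples covered by two overlapping frames are recovered; the first and
# last M samples are covered by only one frame and are not.
print(np.allclose(y[M:-M], x[M:-M]))
```

The time-domain aliasing introduced by each frame cancels in the overlap-add only because the window is symmetric and its two halves are power complementary, which is why the decoder in this sketch reuses the encoder window unchanged.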
Specification