Speech playback speed change using wavelet coding, preferably sub-band coding
Abstract
A method of speeding up playback of a digitized audio signal without raising the pitch and without introducing discontinuities in the speech signal comprises sub-band coding (SBC) consecutive blocks of the audio signal with standard SBC or wavelet compression to derive frames of data. Next, periodic adjacent pairs of the frames are dropped to leave a stream of remaining frames. A sped-up approximation of the digitized audio signal is then reconstructed by sub-band decoding consecutive remaining frames. The method can also be used to slow speech playback by replicating, rather than dropping, adjacent pairs of frames.
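The rate change implied by the abstract follows from simple frame counting. The sketch below is only an illustration: the selection period (`period`) and the function name `speed_factor` are assumptions, since the abstract does not say how often a pair of frames is selected.

```python
from fractions import Fraction


def speed_factor(period: int, mode: str, group: int = 2) -> Fraction:
    """Playback speed relative to the original signal.

    Assumes (hypothetically) that `group` adjacent frames are dropped or
    replicated once in every `period` frames; group=2 matches the
    "adjacent pairs" wording of the abstract.
    """
    if group < 1 or period <= group:
        raise ValueError("need group >= 1 and period > group")
    if mode == "drop":
        frames_played = period - group      # fewer frames: faster playback
    elif mode == "replicate":
        frames_played = period + group      # more frames: slower playback
    else:
        raise ValueError("mode must be 'drop' or 'replicate'")
    return Fraction(period, frames_played)


print(speed_factor(10, "drop"))       # 5/4
print(speed_factor(10, "replicate"))  # 5/6
```

Under these assumptions, dropping one pair in every ten frames plays the material 1.25 times faster, while replicating one pair in every ten frames plays it at 5/6 of the original rate.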
12 Claims
1. A method of changing the playback speed of a digitised time domain audio signal which has been transformed into a wavelet coded audio signal comprising a stream of frames, comprising:
- selecting periodic ones of frames from said stream of wavelet coded frames;
- modifying said stream of wavelet coded frames by dropping said selected frames from said wavelet coded audio signal to leave a modified stream of frames or by replicating said selected frames and including said replicated frames in said wavelet coded audio signal to form a modified stream of frames;
- wavelet decoding consecutive frames of said modified stream of frames to construct a modified time domain signal which approximates pitch of said digitised time domain audio signal but has a different playback speed.
Dependent claims: 2, 3
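The selecting and modifying steps of claim 1 can be pictured with a minimal sketch that treats each coded frame as an opaque object. The function name `modify_stream`, the `period` parameter, and the choice of selecting every `period`-th frame are assumptions for illustration; the claim only requires that periodic frames be selected. Wavelet decoding of the modified stream is outside this sketch.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def modify_stream(frames: Sequence[Frame], period: int, mode: str) -> List[Frame]:
    """Drop or replicate every `period`-th frame of a coded stream.

    Frames are never inspected, only kept, skipped, or repeated, so the
    same routine works for any frame representation.
    """
    if period < 2:
        raise ValueError("period must be at least 2")
    if mode not in ("drop", "replicate"):
        raise ValueError("mode must be 'drop' or 'replicate'")

    modified: List[Frame] = []
    for index, frame in enumerate(frames):
        selected = (index + 1) % period == 0   # hypothetical selection rule
        if selected and mode == "drop":
            continue                           # frame is left out of the stream
        modified.append(frame)
        if selected and mode == "replicate":
            modified.append(frame)             # identical copy placed right after
    return modified


stream = ["f0", "f1", "f2", "f3", "f4", "f5"]
print(modify_stream(stream, 3, "drop"))
# ['f0', 'f1', 'f3', 'f4']
print(modify_stream(stream, 3, "replicate"))
# ['f0', 'f1', 'f2', 'f2', 'f3', 'f4', 'f5', 'f5']
```

Because the modification happens entirely in the coded domain, the time domain samples are only produced once, by the decoder that consumes the modified stream.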
4. A method of operating upon a wavelet coded audio signal comprising stream of frames in order to slow the speaking rate in respect of a digitised time domain signal from which said wavelet coded audio signal was derived comprising:
- replicating periodic ones of said frames in said stream of frames and including said replicated frames in said wavelet coded audio signal to form a modified stream of frames with periodic adjacent identical sequences of frames;
- wavelet decoding consecutive frames of said modified stream of frames to construct a modified time domain signal which, when played back, approximates pitch of said digitised time domain audio signal but has a slower speaking rate.
5. A method of speeding up playback of a digitised time domain audio signal, comprising:
- wavelet encoding by progressively filtering each of consecutive blocks of said time domain audio signal with finite impulse response (FIR) low pass filters (LPFs) and with FIR high pass filters (HPFs) to obtain, for each block, a plurality of wavelet domain sub-blocks, each wavelet domain sub-block of said plurality of wavelet domain sub-blocks having audio signal samples spanning a frequency band;
- building a plurality of wavelet domain data frames, each wavelet domain data frame built from a plurality of wavelet domain sub-blocks derived from a given time domain block;
- dropping periodic ones of said wavelet domain data frames to leave a stream of remaining wavelet domain data frames;
- filtering consecutive frames in said stream of remaining wavelet domain data frames with FIR LPFs and FIR HPFs to construct a time domain signal which, on playback, approximates pitch of said digitised time domain audio signal but has a faster speaking rate.
Dependent claims: 6, 7, 8
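The encode, drop, and decode pipeline of claim 5 can be sketched end to end with the simplest possible filter pair. Everything specific here is an assumption made for illustration: the claim does not name a wavelet, so the 2-tap Haar filters are swapped in because they keep each block self-contained, and the names `encode_block`, `decode_frame`, `speed_up`, the block size, the number of levels, and the drop period are all hypothetical.

```python
import math
from typing import List

S = 1 / math.sqrt(2)
LPF = (S, S)     # 2-tap FIR low pass filter (Haar, an assumed choice)
HPF = (S, -S)    # 2-tap FIR high pass filter (Haar, an assumed choice)


def analyse(x: List[float]):
    """One filter-bank stage: FIR filter, keeping every second output."""
    low = [LPF[0] * x[i] + LPF[1] * x[i + 1] for i in range(0, len(x), 2)]
    high = [HPF[0] * x[i] + HPF[1] * x[i + 1] for i in range(0, len(x), 2)]
    return low, high


def synthesise(low: List[float], high: List[float]) -> List[float]:
    """Inverse stage: equivalent to upsampling and filtering with the
    Haar synthesis LPF and HPF, then summing the two branches."""
    out: List[float] = []
    for a, d in zip(low, high):
        out.extend((S * (a + d), S * (a - d)))
    return out


def encode_block(block: List[float], levels: int = 2) -> List[List[float]]:
    """Progressively filter the low band; the returned frame lists the
    sub-blocks from the lowest frequency band to the highest."""
    details = []
    low = list(block)
    for _ in range(levels):
        low, high = analyse(low)
        details.append(high)
    return [low] + details[::-1]


def decode_frame(frame: List[List[float]]) -> List[float]:
    low = frame[0]
    for high in frame[1:]:
        low = synthesise(low, high)
    return low


def speed_up(samples, block_size=8, period=3, levels=2) -> List[float]:
    """Encode consecutive blocks, drop every `period`-th frame, decode."""
    if block_size % (2 ** levels):
        raise ValueError("block_size must be a multiple of 2**levels")
    usable = len(samples) - len(samples) % block_size   # ignore a partial tail
    frames = [encode_block(samples[i:i + block_size], levels)
              for i in range(0, usable, block_size)]
    kept = [f for n, f in enumerate(frames) if (n + 1) % period != 0]
    out: List[float] = []
    for frame in kept:
        out.extend(decode_frame(frame))
    return out


faster = speed_up([float(n) for n in range(24)])
print(len(faster))  # 16
```

With three blocks of eight samples and a period of three, the third frame is dropped and the first sixteen samples are reconstructed. A practical codec would likely use longer filters than this toy Haar pair, which is why the sketch should not be read as the patented implementation.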
9. A method of changing the speaking rate in respect of a digitised time domain audio signal which has been transformed into a wavelet coded audio signal comprising a stream of wavelet coded frames, comprising:
- selecting periodic pairs of adjacent frames in said stream of wavelet coded frames;
- modifying said stream of wavelet coded frames by dropping said selected pairs of adjacent frames from said stream of wavelet coded frames to leave a modified stream of frames or replicating said selected pairs of adjacent frames and including said replicated frames in said wavelet coded audio signal to form a modified stream of wavelet coded frames;
- wavelet decoding consecutive frames of said modified stream of frames to construct a modified digitised time domain audio signal which, on playback, approximates pitch of said digitised time domain audio signal but has a different speaking rate.
Dependent claims: 10
11. Apparatus for changing the speaking rate in respect of a digitised time domain audio signal which has been transformed into a wavelet coded audio signal comprising a stream of wavelet coded frames, comprising:
- means for selecting periodic pairs of adjacent frames of said wavelet coded audio signal;
- means for modifying said wavelet coded audio signal by dropping said selected pairs of adjacent frames from said wavelet coded audio signal to leave a stream of frames or replicating said selected pairs of adjacent frames in said wavelet coded audio signal to form a stream of frames including each replicated pair of adjacent frames; and
- means for wavelet decoding consecutive frames of said modified stream of frames to construct a modified digitised time domain audio signal which, on playback, approximates pitch of said digitised time domain audio signal but has a different speaking rate.
Dependent claims: 12
Specification