Process for varying speech speed and device for implementing said process
Abstract
The process for varying the speed of a speech signal involves splitting at least a portion of the speech frequency bandwidth into N narrow sub-bands and processing the contents of each sub-band signal to derive therefrom magnitude data M(i, n) and phase data P(i, n), i=1, . . . , N being the sub-band index and n the time index. The M(i, n) sequence is converted into a sequence M'(i, n) by either dropping or duplicating one sample every K samples (K being an integer value derived from the desired slowing-down/speeding-up ratio). The phase sequence P(i, n) is processed to derive therefrom an increment sequence D(i, n)=P(i, n)-P(i, n-1), which increment sequence is first converted into a D'(i, n) sequence by either dropping or duplicating one sample every K samples, before being converted into P'(i, n)=P'(i, n-1)+D'(i, n). The M'(i, n) and P'(i, n) sequences are converted back into sub-band signal contents, then combined together into the slowed-down/speeded-up speech signal.
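The per-sub-band step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `resample_every_kth` and `vary_speed_subband` are hypothetical, the "every Kth sample" count is assumed to start at 1, and P(-1) is assumed to be 0 so that the first increment equals the first phase sample.

```python
from typing import List, Tuple


def resample_every_kth(seq: List[float], k: int, speed_up: bool) -> List[float]:
    """Drop (speed_up=True) or duplicate (speed_up=False) every k-th sample.

    Assumption: samples are counted from 1, so samples k, 2k, 3k, ... are affected.
    """
    out = []
    for idx, x in enumerate(seq, start=1):
        if idx % k == 0:
            if speed_up:
                continue        # drop this sample
            out.append(x)       # extra copy; the regular copy is appended below
        out.append(x)
    return out


def vary_speed_subband(
    mag: List[float], phase: List[float], k: int, speed_up: bool
) -> Tuple[List[float], List[float]]:
    """Return (M', P') for one sub-band from its M(n) and P(n) sequences."""
    # D(n) = P(n) - P(n-1), with P(-1) assumed to be 0.
    incr = [p - q for p, q in zip(phase, [0.0] + phase[:-1])]

    new_mag = resample_every_kth(mag, k, speed_up)
    new_incr = resample_every_kth(incr, k, speed_up)

    # P'(n) = P'(n-1) + D'(n): re-accumulate the edited increments.
    new_phase, acc = [], 0.0
    for d in new_incr:
        acc += d
        new_phase.append(acc)
    return new_mag, new_phase
```

Editing the increments D rather than the phase samples P keeps the rebuilt phase track continuous across dropped or duplicated samples, which appears to be the reason the abstract routes the phase through D and D'.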
8 Claims
1. An apparatus for digitally varying the speed of a speech signal having a speech frequency bandwidth without measuring or substantially varying the pitch of the speech signal, including:
means for splitting at least a portion of the speech frequency bandwidth of said speech signal into a plurality of consecutive narrow sub-band signals;
means for processing each of said sub-band signals to derive therefrom phase samples and magnitude samples representative of the sub-band signal contents expressed in polar coordinates;
means for speed varying said sub-band signals by repeating phase and magnitude samples or deleting samples therefrom at a rate depending upon the desired slowing-down or speeding-up rate respectively;
means for recombining each sub-band phase and magnitude samples into a speed varied sub-band signal; and
means for recombining said speed varied sub-band signals into recombined speech, whereby said recombined speech is a speed varied version of said speech signal having substantially the same pitch as said speech signal.
2. An apparatus for speed varying a speech signal sampled at frequency fs without measuring or substantially varying the pitch of the speech signal, characterized in that it includes:
a first bank of quadrature mirror filters (QMF) for splitting a limited bandwidth of said speech signal into a plurality of N narrow sub-band signals, N being an integer value greater than 1;
first down sampling means, connected to said QMF bank for down sampling each of said sub-band signals at a rate fs/N;
complex quadrature mirror filtering (CQMF) means connected to said first down sampling means for converting each down sampled sub-band signal into an analytical signal represented by in-phase and quadrature components;
second down sampling means connected to said CQMF for down sampling said in-phase and quadrature components to fs/2N;
coordinate converting means connected to said second down sampling means for converting said analytical signal into magnitude component M(i,n) samples and phase component P(i,n) samples, with i=1, . . ., N being the sub-band index and n being the time index;
speed variation means connected to said coordinate converting means for deleting or repeating samples of said magnitude component M(i,n) and said phase component P(i,n) at a rate depending upon the desired speech rate variation whereby M'(i,n) data are generated from said magnitude component M(i,n) and P'(i,n) data are generated from said phase component P(i,n);
coordinate converting means connected to said speed variation means for converting said M'(i,n) and P'(i,n) into rate converted analytical data u'(i,n) and v'(i,n) respectively;
inverse complex QMF filtering means (ICQMF) connected to the output of said coordinate converting means for up sampling said rate converted analytical data u'(i,n) and v'(i,n) to a rate fs; and
an inverse QMF filter bank connected to the output of said ICQMF means for providing a speed varied speech signal s'(n), said speed varied speech signal s'(n) having a pitch substantially the same as said speech signal.
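The two coordinate converting means of claim 2 amount to a rectangular-to-polar conversion and its inverse. The sketch below is one plausible reading, assuming the in-phase component is the real part and the quadrature component the imaginary part of the analytical signal, and that phase is expressed in radians. The function names `to_polar` and `to_rect` are hypothetical and do not come from the patent.

```python
import math
from typing import Tuple


def to_polar(u: float, v: float) -> Tuple[float, float]:
    """Convert in-phase u and quadrature v into (magnitude M, phase P)."""
    return math.hypot(u, v), math.atan2(v, u)


def to_rect(m: float, p: float) -> Tuple[float, float]:
    """Convert magnitude M' and phase P' back into (u', v')."""
    return m * math.cos(p), m * math.sin(p)
```

Because `to_rect` only uses the cosine and sine of the phase, a phase sequence that has drifted outside the range returned by `math.atan2` after re-accumulation still maps to the same u' and v' values.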
3. A method for digitally varying the speed of a speech signal without measuring or substantially varying the pitch of the speech signal, said method comprising the steps of:
splitting at least a portion of the speech frequency bandwidth of said speech signal into a plurality of consecutive narrow sub-band signals;
processing each of said sub-band signals to derive therefrom phase samples and magnitude samples representative of the sub-band signal contents expressed in polar coordinates;
speed varying said sub-band signals by repeating phase and magnitude samples or deleting samples therefrom at a rate depending upon the desired slowing-down or speeding-up rate respectively;
recombining each of said speed varied sub-band phase and magnitude samples into a speed varied sub-band signal; and
recombining said recombined speed varied sub-band signals into recombined speech, whereby said recombined speech is a speed varied version of said speech signal having substantially the same pitch as said speech signal.
(Dependent claims: 4, 5, 6, 7)
8. An apparatus for speed varying a speech signal sampled at frequency fs, characterized in that it includes:
a first bank of quadrature mirror filters (QMF) for splitting a limited bandwidth of said speech signal into a plurality of N narrow sub-band signals, N being an integer value greater than 1;
first down sampling means, connected to said QMF bank for down sampling each of said sub-band signals at a rate fs/N;
complex quadrature mirror filtering (CQMF) means connected to said first down sampling means for converting each down sampled sub-band signal into an analytical signal represented by in-phase and quadrature components;
second down sampling means connected to said CQMF for down sampling said in-phase and quadrature components to fs/2N;
coordinate converting means connected to said second down sampling means for converting said analytical signal into magnitude component M(i,n) samples and phase component P(i,n) samples, with i=1, . . ., N being the sub-band index and n being the time index;
speed variation means connected to said coordinate converting means for deleting or repeating samples of said magnitude component M(i,n) and said phase component P(i,n) at a rate depending upon the desired speech rate variation whereby M'(i,n) data are generated from said magnitude component M(i,n) and P'(i,n) data are generated from said phase component P(i,n);
said speed variation means further including:
means for generating a sequence of magnitude signal components M(n) for each sub-band of said magnitude component M(i,n);
means for generating a sequence of phase signal components P(n) for each sub-band of said phase component P(i,n);
means for speeding up said speech signal at a rate K/(K-1), K being a predetermined integer having a value greater than 1, including, for each sub-band:
means for converting the sequence of magnitude signal components M(n) into a speeded-up M'(n) by deleting every Kth M(n) sample;
means for generating a phase increment component sequence D(n) according to
D(n)=P(n)-P(n-1);
means for converting the D(n) component sequence into D'(n) by deleting every Kth sample from D(n); and
means for generating a speeded-up phase sequence P'(n) with
P'(n)=P'(n-1)+D'(n);
means for slowing down the speech signal at a rate K/(K+1), K being a predetermined integer having a value greater than 0, including, for each sub-band:
means for converting the sequence of magnitude signal components M(n) into a slowed-down sequence M'(n) by repeating every Kth M(n) sample;
means for generating a phase increment component sequence D(n) according to
D(n)=P(n)-P(n-1);
means for converting the D(n) component sequence into D'(n) by duplicating every Kth sample; and
means for generating a slowed-down phase sequence P'(n) with
P'(n)=P'(n-1)+D'(n);
coordinate converting means connected to said speed variation means for converting said M'(i,n) and P'(i,n) into rate converted analytical data u'(i,n) and v'(i,n) respectively;
inverse complex QMF filtering means (ICQMF) connected to the output of said coordinate converting means for up sampling said rate converted analytical data u'(i,n) and v'(i,n) to a rate fs; and
an inverse QMF filter bank connected to the output of said ICQMF means for providing a speed varied speech signal s'(n).
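The two rates named in claim 8 can be checked with a little arithmetic. The sketch below assumes that "rate" means the ratio of input samples to output samples in a sub-band, so deleting one sample in every K gives K/(K-1) and repeating one in every K gives K/(K+1). The helper names `speed_ratio` and `output_length` are hypothetical.

```python
from fractions import Fraction


def speed_ratio(k: int, speed_up: bool) -> Fraction:
    """Speed factor as read from claim 8: K/(K-1) when speeding up, K/(K+1) when slowing down."""
    if speed_up:
        if k <= 1:
            raise ValueError("speeding up requires K > 1")
        return Fraction(k, k - 1)
    if k <= 0:
        raise ValueError("slowing down requires K > 0")
    return Fraction(k, k + 1)


def output_length(n_samples: int, k: int, speed_up: bool) -> int:
    """Number of sub-band samples left after deleting or repeating every K-th sample."""
    changed = n_samples // k
    return n_samples - changed if speed_up else n_samples + changed


# Example with K = 4 and 12 input samples per sub-band.
fast = output_length(12, 4, speed_up=True)    # 9 samples remain
slow = output_length(12, 4, speed_up=False)   # 15 samples result
assert Fraction(12, fast) == speed_ratio(4, speed_up=True)    # 4/3
assert Fraction(12, slow) == speed_ratio(4, speed_up=False)   # 4/5
```

Under this reading, K = 2 gives the largest single-pass speed-up (a factor of 2) and K = 1 the largest slow-down (a factor of 1/2); larger K values move both ratios toward 1.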
Specification