Method and apparatus for speech analysis and synthesis

  • US 8,280,739 B2
  • Filed: 04/03/2008
  • Issued: 10/02/2012
  • Est. Priority Date: 04/04/2007
  • Status: Active Grant
First Claim

1. A speech analysis method, comprising the steps of:

  • obtaining a speech signal and a corresponding DEGG/EGG signal;

    providing the speech signal as the output of a vocal tract filter in a source-filter model taking the DEGG/EGG signal as the input; and

    estimating the features of the vocal tract filter from the speech signal as the output and the DEGG/EGG signal as the input, wherein the features of the vocal tract filter are expressed by the state vectors of the vocal tract filter at selected time points, and the step of estimating is performed using Kalman filtering, wherein the Kalman filtering is a two-way, bi-directional Kalman filtering comprising a forward Kalman filtering in which a future state is estimated from a past state and a backward Kalman filtering in which a past state is estimated from a future state, and wherein the forward Kalman filtering comprises forward estimation, correction and forward recursion, the backward Kalman filtering comprises backward estimation, correction and backward recursion, and estimation results of the two-way Kalman filtering are a combination of estimation results of the forward Kalman filtering and estimation results of the backward Kalman filtering, wherein Kalman filtering is based on:

    a state function
    $x_k = x_{k-1} + d_k$, and
    an observation function
    $v_k = e_k^T x_k + n_k$,
    wherein $x_k = [x_k(0), x_k(1), \ldots, x_k(N-1)]^T$ represents the state vector to be estimated of the vocal tract filter at time point k, wherein $x_k(0), x_k(1), \ldots, x_k(N-1)$ represent N samples of the expected unit impulse response of the vocal tract filter at time k;

    $d_k = [d_k(0), d_k(1), \ldots, d_k(N-1)]^T$ represents the disturbance added to the state vector of the vocal tract filter at time k;

    $e_k = [e_k, e_{k-1}, \ldots, e_{k-N+1}]^T$ is a vector, of which the element $e_k$ represents the DEGG signal inputted at time k;

    $v_k$ represents the speech signal outputted at time k; and

    $n_k$ represents the observation noise added to the outputted speech signal at time k, and wherein

    the forward Kalman filtering comprises the steps of:

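    forward estimation:
    $\tilde{x}_k = x_{k-1}^*$,
    $\tilde{P}_k = P_{k-1} + Q$;

    correction:
    $K_k = \tilde{P}_k e_k [e_k^T \tilde{P}_k e_k + r]^{-1}$,
    $x_k^* = \tilde{x}_k + K_k [v_k - e_k^T \tilde{x}_k]$,
    $P_k = [I - K_k e_k^T] \tilde{P}_k$;

    forward recursion:
    $k = k + 1$;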

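    the backward Kalman filtering comprises the steps of:

    backward estimation:
    $\tilde{x}_k = x_{k+1}^*$,
    $\tilde{P}_k = P_{k+1} + Q$;

    correction:
    $K_k = \tilde{P}_k e_k [e_k^T \tilde{P}_k e_k + r]^{-1}$,
    $x_k^* = \tilde{x}_k + K_k [v_k - e_k^T \tilde{x}_k]$,
    $P_k = [I - K_k e_k^T] \tilde{P}_k$;

    backward recursion:
    $k = k - 1$;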

    wherein $\tilde{x}_k$ represents the estimated state value at time point k, $x_k^*$ represents the corrected state value at time point k, $\tilde{P}_k$ represents the pre-estimated value of the covariance matrix of the estimation error, $P_k$ represents the corrected value of the covariance matrix of the estimation error, $Q$ represents the covariance matrix of disturbance $d_k$, $K_k$ represents the Kalman gain, $r$ represents the variance of the observation noise $n_k$, and $I$ represents the unit matrix; and

    the estimation results of the two-way Kalman filtering are the combination of the estimation results of the forward Kalman filtering and those of the backward Kalman filtering using the following formula:

    $P_k = (P_{k+}^{-1} + P_{k-}^{-1})^{-1}$,
    $x_k^* = P_k (P_{k+}^{-1} x_{k+}^* + P_{k-}^{-1} x_{k-}^*)$,
    wherein $x_{k+}$, $P_{k+}$ are the estimated state value and the covariance of the estimation obtained by the forward Kalman filtering respectively, and $x_{k-}$, $P_{k-}$ represent the estimated state value and the covariance of the estimation obtained by the backward Kalman filtering respectively.
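
The steps recited in the claim can be illustrated with a minimal sketch in plain Python. It is not the patentee's implementation. The function names (`input_vector`, `kalman_pass`, `inverse`, `combine`), the choice Q = q·I, the zero initial state, the initial covariance p0·I, the zero padding of the DEGG signal before its first sample, and the synthetic test data are all assumptions made for illustration. The backward pass reuses the same estimation and correction steps with the time index running downwards.

```python
import random

def input_vector(e, k, n):
    # e_k = [e_k, e_{k-1}, ..., e_{k-N+1}]^T; zero before the signal start (assumption)
    return [e[k - i] if k - i >= 0 else 0.0 for i in range(n)]

def kalman_pass(e, v, n, q, r, p0, backward=False):
    """Run one pass; return corrected states x_k^* and covariances P_k per time point."""
    T = len(v)
    x = [0.0] * n  # assumed initial state
    P = [[p0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    xs, Ps = [None] * T, [None] * T
    for k in (range(T - 1, -1, -1) if backward else range(T)):
        # estimation: x~_k is the neighbouring x^*, P~_k = P_neighbour + Q (Q = q*I assumed)
        Pt = [[P[i][j] + (q if i == j else 0.0) for j in range(n)] for i in range(n)]
        ek = input_vector(e, k, n)
        # correction: the observation is scalar, so the bracketed inverse is a division
        Pe = [sum(Pt[i][j] * ek[j] for j in range(n)) for i in range(n)]
        s = sum(ek[i] * Pe[i] for i in range(n)) + r
        K = [pe / s for pe in Pe]
        innovation = v[k] - sum(ek[i] * x[i] for i in range(n))
        x = [x[i] + K[i] * innovation for i in range(n)]
        # P_k = (I - K e^T) P~_k; e^T P~_k equals Pe^T because P~_k is symmetric
        P = [[Pt[i][j] - K[i] * Pe[j] for j in range(n)] for i in range(n)]
        xs[k], Ps[k] = x, P
    return xs, Ps

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(a)
    m = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [val / pivot for val in m[c]]
        for i in range(n):
            if i != c:
                f = m[i][c]
                m[i] = [a_ - f * b_ for a_, b_ in zip(m[i], m[c])]
    return [row[n:] for row in m]

def combine(xf, Pf, xb, Pb):
    """P = (Pf^-1 + Pb^-1)^-1 and x = P (Pf^-1 xf + Pb^-1 xb), as in the claim's formula."""
    n = len(xf)
    Pf_inv, Pb_inv = inverse(Pf), inverse(Pb)
    P = inverse([[Pf_inv[i][j] + Pb_inv[i][j] for j in range(n)] for i in range(n)])
    w = [sum(Pf_inv[i][j] * xf[j] + Pb_inv[i][j] * xb[j] for j in range(n))
         for i in range(n)]
    return [sum(P[i][j] * w[j] for j in range(n)) for i in range(n)], P

# Synthetic data (assumption): a fixed 3-tap impulse response driven by white noise.
random.seed(0)
h_true = [1.0, 0.5, -0.25]
T, n = 200, len(h_true)
degg = [random.gauss(0.0, 1.0) for _ in range(T)]
speech = [sum(h * u for h, u in zip(h_true, input_vector(degg, k, n)))
          + random.gauss(0.0, 0.1) for k in range(T)]

xf, Pf = kalman_pass(degg, speech, n, q=1e-6, r=0.01, p0=10.0)
xb, Pb = kalman_pass(degg, speech, n, q=1e-6, r=0.01, p0=10.0, backward=True)
x_hat, P_hat = combine(xf[100], Pf[100], xb[100], Pb[100])
print([round(c, 3) for c in x_hat])
```

In this sketch the combined estimate `x_hat` at time point 100 can be compared against `h_true`. With real data the state would instead be read out at the selected time points named in the claim.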

  • 8 Assignments