Method and system for detecting voice activity based on cross-correlation
Abstract
A system and method are provided for determining whether a data frame of a coded speech signal corresponds to voice or to noise. In one embodiment, a voice activity detector determines a cross-correlation of data. If the cross-correlation is lower than a predetermined cross-correlation value, then the data frame corresponds to noise. If not, then the voice activity detector determines a periodicity of the cross-correlation and a variance of the periodicity. If the variance is less than a predetermined variance value, then the data frame corresponds to voice. In another embodiment, a method determines an energy of the data frame and an average energy of the coded speech signal. If the data frame is one of a predetermined number of initial data frames, then a comparison of the average energy with the energy of the data frame is used to determine whether the data frame is noise or voice.
19 Claims
1. A method, comprising:
receiving coded speech signals;
partitioning the coded speech signals into data frames; and
for each of at least some of the data frames, determining whether the data frame corresponds to voice or to noise, by:
determining a cross-correlation Y(τ) of data of said data frame;
determining a periodicity of the cross-correlation;
determining a variance σ² of the periodicity;
determining said data frame corresponds to said noise when the cross-correlation is lower than a threshold cross-correlation value; and
determining said data frame corresponds to said voice if the variance is less than a threshold variance value.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
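The decision steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim text here does not define how Y(τ), the periodicity, or the thresholds are computed, so the sketch assumes that Y(τ) is the frame correlated with a lagged copy of itself and normalized by the frame energy, that the periodicity is the spacing between successive peaks of Y(τ), and that a frame matching neither stated condition is treated as noise. The names `cross_correlation`, `peak_lags`, `classify_frame` and all default values (`min_lag`, `max_lag`, `corr_threshold`, `var_threshold`) are hypothetical.

```python
import math
from statistics import pvariance


def cross_correlation(frame, min_lag, max_lag):
    """Y(lag) for lag in [min_lag, max_lag]: the frame multiplied by a lagged
    copy of itself, normalized by the frame energy (assumed definition)."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return [0.0] * (max_lag - min_lag + 1)
    return [
        sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag)) / energy
        for lag in range(min_lag, max_lag + 1)
    ]


def peak_lags(y, min_lag, floor):
    """Lags at which Y has a local maximum of at least `floor`."""
    return [
        min_lag + i
        for i in range(1, len(y) - 1)
        if y[i - 1] < y[i] >= y[i + 1] and y[i] >= floor
    ]


def classify_frame(frame, min_lag=16, max_lag=80,
                   corr_threshold=0.3, var_threshold=4.0):
    """Label one frame "voice" or "noise" (all thresholds are hypothetical)."""
    y = cross_correlation(frame, min_lag, max_lag)
    # Noise when the cross-correlation stays below the threshold.
    if max(y) < corr_threshold:
        return "noise"
    # Periodicity: spacing between successive peaks of Y (assumed definition).
    peaks = peak_lags(y, min_lag, corr_threshold)
    spacings = [b - a for a, b in zip(peaks, peaks[1:])]
    # Voice when the variance of the periodicity is below the threshold.
    if len(spacings) >= 2 and pvariance(spacings) < var_threshold:
        return "voice"
    # Assumption: frames meeting neither stated condition default to noise.
    return "noise"


# A strictly periodic frame versus a single click with no repeating structure.
tone = [math.sin(2 * math.pi * n / 20) for n in range(160)]
click = [1.0] + [0.0] * 159
print(classify_frame(tone), classify_frame(click))
```

Restricting the search to a lag window keeps the trivially high correlation at very small lags from passing the first test on its own; the window bounds used here are illustrative only.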
9. A method, comprising:
receiving coded speech signals;
partitioning the coded speech signals into data frames; and
for each of at least some of the data frames, determining whether the data frame corresponds to voice or to noise, by:
determining an energy of said data frame;
determining an average speech energy of the coded speech signal;
if the data frame is one of a threshold number of initial data frames of the coded speech signal, determining whether the data frame corresponds to said voice or to said noise by,
determining a cross-correlation of data of said data frame,
determining a periodicity of the cross-correlation,
determining a variance of the periodicity;
determining said data frame corresponds to said noise when the cross-correlation is lower than a threshold cross-correlation value; and
determining said data frame corresponds to said voice if the variance is less than a threshold variance value; and
else, comparing the energy of the data frame with the average speech energy, and determining said data frame corresponds to said voice if the average speech energy is less than or equal to the energy of the data frame.
Dependent claims: 10, 11
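The two-branch structure of claim 9 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the patented method: the claim does not say how the average speech energy is accumulated, so the sketch assumes a running mean over all frames seen so far, and it assumes that a later frame whose energy falls below that average is labeled noise. The cross-correlation test is passed in as a callable, and the names `frame_energy`, `detect`, `correlation_detector`, and `initial_frames` are hypothetical.

```python
def frame_energy(frame):
    """Energy of one frame, taken here as the sum of squared samples."""
    return sum(x * x for x in frame)


def detect(frames, correlation_detector, initial_frames=10):
    """Label each frame "voice" or "noise".

    correlation_detector: callable mapping a frame to "voice" or "noise",
    standing in for the cross-correlation / periodicity-variance test.
    initial_frames: hypothetical threshold number of initial frames.
    """
    labels = []
    total_energy = 0.0
    for index, frame in enumerate(frames):
        energy = frame_energy(frame)
        total_energy += energy
        # Assumption: the average is a running mean over frames seen so far.
        average_energy = total_energy / (index + 1)
        if index < initial_frames:
            # Initial frames use the cross-correlation based decision.
            label = correlation_detector(frame)
        else:
            # Later frames: voice if the average energy <= frame energy.
            label = "voice" if average_energy <= energy else "noise"
        labels.append(label)
    return labels


# Two initial frames go to the stand-in detector; the rest use energy.
frames = [[1.0, 1.0], [1.0, 1.0], [0.1, 0.1], [2.0, 2.0]]
print(detect(frames, lambda frame: "voice", initial_frames=2))
```

Passing the detector as a parameter keeps the energy comparison independent of any particular way of computing the cross-correlation.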
12. A voice activity detector, comprising:
means for determining whether a data frame of a coded speech signal corresponds to voice or to noise, including:
means for determining a cross-correlation Y(τ) of data of said data frame;
means for determining a periodicity of the cross-correlation;
means for determining a variance σ² of the periodicity;
means for determining said data frame corresponds to said noise when the cross-correlation is lower than a threshold cross-correlation value; and
means for determining said data frame corresponds to voice if the variance is less than a threshold variance value.
Dependent claims: 13, 14, 15, 16, 17, 18, 19
Specification