Voice Activity Detection Method and Apparatus, and Electronic Device
Abstract
A voice activity detection method and apparatus, and an electronic device are provided. The method includes: obtaining a time domain parameter and a frequency domain parameter from an audio frame; obtaining a first distance between the time domain parameter and a long-term slip mean of the time domain parameter in a history background noise frame, and obtaining a second distance between the frequency domain parameter and a long-term slip mean of the frequency domain parameter in the history background noise frame; and judging whether the audio frame is a foreground voice frame or a background noise frame according to the first distance, the second distance and a set of decision inequalities based on the first distance and the second distance. The above technical solutions enable the judgment criterion to have an adaptive adjustment capability, thus improving the performance of the voice activity detection.
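The first two steps of the method (parameter extraction and distance to a long-term sliding mean) can be illustrated with a minimal sketch. The text above does not name the parameters or the distance measure, so every concrete choice here is an assumption made for illustration: zero-crossing rate as the time domain parameter, a spectral centroid from a naive DFT as the frequency domain parameter, an exponential moving average as the long-term slip mean, and absolute difference as the distance. The names `zero_crossing_rate`, `spectral_centroid`, and `NoiseTracker` are hypothetical.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Assumed time domain parameter: fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def spectral_centroid(frame):
    """Assumed frequency domain parameter: magnitude-weighted mean bin index (naive DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        mags.append(abs(acc))
    total = sum(mags)
    return sum(k * m for k, m in enumerate(mags)) / total if total > 0 else 0.0


class NoiseTracker:
    """Tracks long-term sliding means of both parameters over background noise frames.

    The exponential moving average and the smoothing factor `alpha` are assumptions,
    not values taken from the patent text.
    """

    def __init__(self, alpha=0.95):
        self.alpha = alpha
        self.mean_time = None
        self.mean_freq = None

    def update(self, time_param, freq_param):
        """Fold one frame already judged to be background noise into the means."""
        if self.mean_time is None:
            self.mean_time, self.mean_freq = time_param, freq_param
        else:
            a = self.alpha
            self.mean_time = a * self.mean_time + (1 - a) * time_param
            self.mean_freq = a * self.mean_freq + (1 - a) * freq_param

    def distances(self, time_param, freq_param):
        """Return (first distance, second distance) as absolute differences."""
        return abs(time_param - self.mean_time), abs(freq_param - self.mean_freq)
```

In this sketch the tracker is updated only with frames that were classified as background noise, so the means describe the noise floor rather than the speech.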
18 Claims
1. A voice activity detection method, comprising:
obtaining a time domain parameter and a frequency domain parameter from a current audio frame to be detected;
obtaining a first distance between the time domain parameter and a long-term slip mean of the time domain parameter in a history background noise frame;
obtaining a second distance between the frequency domain parameter and a long-term slip mean of the frequency domain parameter in the history background noise frame; and
judging whether the current audio frame is a foreground voice frame or a background noise frame according to the first distance, the second distance, and a set of decision inequalities based on the first distance and the second distance, wherein at least one coefficient in the set of decision inequalities is a variable determined according to a voice activity detection operation mode or features of an input signal.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13

14. A voice activity detection apparatus, comprising:
a first obtaining module, configured to obtain a time domain parameter and a frequency domain parameter from a current audio frame to be detected;
a second obtaining module, configured to obtain a first distance between the time domain parameter and a long-term slip mean of the time domain parameter in a history background noise frame, and obtain a second distance between the frequency domain parameter and a long-term slip mean of the frequency domain parameter in the history background noise frame; and
a judging module, configured to judge whether the current audio frame to be detected is a foreground voice frame or a background noise frame according to the first distance, the second distance, and a set of decision inequalities based on the first distance and the second distance, wherein at least one coefficient in the set of decision inequalities is a variable determined according to a voice activity detection operation mode or features of an input signal.
Dependent claims: 15, 16, 17, 18

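The judging step in claims 1 and 14 relies on decision inequalities whose coefficients can vary with the operation mode. The claims do not give the form of the inequalities, so the following is only a sketch under stated assumptions: each inequality is linear in the two distances, a frame is treated as foreground voice if any inequality holds, and the operation mode selects a scale applied to every threshold. The mode names, the `MODE_SCALE` values, and the entries in `RULES` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical operation modes and the threshold scale each one selects.
MODE_SCALE = {"sensitive": 0.5, "balanced": 1.0, "conservative": 2.0}


@dataclass(frozen=True)
class Inequality:
    """One assumed decision inequality: w_time * d1 + w_freq * d2 > base_threshold * scale."""
    w_time: float
    w_freq: float
    base_threshold: float

    def holds(self, d1, d2, scale):
        return self.w_time * d1 + self.w_freq * d2 > self.base_threshold * scale


def is_voice(d1, d2, inequalities, mode="balanced"):
    """Return True for a foreground voice frame, False for a background noise frame.

    Combining the inequalities with a logical OR is an assumption of this sketch.
    """
    scale = MODE_SCALE[mode]  # the mode-dependent (variable) coefficient
    return any(ineq.holds(d1, d2, scale) for ineq in inequalities)


# Illustrative rule set; the weights and thresholds are made up for the example.
RULES = [
    Inequality(1.0, 0.0, 0.2),  # time domain distance alone
    Inequality(0.0, 1.0, 0.3),  # frequency domain distance alone
    Inequality(1.0, 1.0, 0.4),  # both distances combined
]
```

With the same pair of distances, a more sensitive mode lowers every threshold and so classifies more frames as voice, which is one way a judgment criterion could adapt to the operation mode.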
Specification