Speech signal processing method and apparatus
Abstract
A speech signal processing method is performed at a terminal device, including: obtaining a recorded signal and a to-be-output speech signal, the recorded signal including a noise signal and an echo signal; calculating a loop transfer function according to the recorded signal and the speech signal; calculating a power spectrum of the echo signal and a power spectrum of the noise signal according to the recorded signal, the speech signal, and the loop transfer function; calculating a frequency weighted coefficient according to the two power spectra of the echo signal and the noise signal; adjusting a frequency amplitude of the speech signal based on the frequency weighted coefficient; and outputting the adjusted speech signal to a speaker electrically coupled to the terminal device. As such, the frequency amplitude of the speech signal is automatically adjusted according to the relative frequency distribution of a noise signal and the speech signal.
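The estimation steps in the abstract can be illustrated with a minimal sketch. This excerpt does not give the formulas, so everything below is an assumption: the loop transfer function is taken as the cross-spectrum of the recorded and speech signals divided by the speech power spectrum (a common correlation-based estimator), the echo power as the speech power scaled by the squared magnitude of that function, and the noise power as whatever recorded power remains. The function name `estimate_spectra` and the frame layout are hypothetical.

```python
from typing import List, Tuple


def estimate_spectra(
    rec_frames: List[List[complex]],
    spk_frames: List[List[complex]],
    eps: float = 1e-12,
) -> Tuple[List[complex], List[float], List[float]]:
    """Return (loop transfer function, echo power, noise power) per bin.

    rec_frames[t][k] / spk_frames[t][k]: spectrum value of frame t at
    frequency bin k for the recorded signal and the to-be-output speech
    signal (hypothetical layout, not taken from the patent text).
    """
    n_frames = len(spk_frames)
    n_bins = len(spk_frames[0])
    loop_tf: List[complex] = []
    echo_power: List[float] = []
    noise_power: List[float] = []
    for k in range(n_bins):
        # Cross-spectrum and auto-spectra, summed over frames.
        cross = sum(rec_frames[t][k] * spk_frames[t][k].conjugate()
                    for t in range(n_frames))
        p_spk = sum(abs(spk_frames[t][k]) ** 2 for t in range(n_frames))
        p_rec = sum(abs(rec_frames[t][k]) ** 2 for t in range(n_frames))
        # Assumed estimator: correlation of recorded vs. speech signal.
        h_k = cross / (p_spk + eps)
        # Assumed split: echo is the part of the recording explained by
        # the speech signal; noise is the non-negative remainder.
        p_echo = abs(h_k) ** 2 * p_spk / n_frames
        p_noise = max(p_rec / n_frames - p_echo, 0.0)
        loop_tf.append(h_k)
        echo_power.append(p_echo)
        noise_power.append(p_noise)
    return loop_tf, echo_power, noise_power
```

Averaging over several frames is a design choice of this sketch: with a single frame the cross-spectrum estimator would attribute all recorded power to the echo and leave no noise estimate.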
19 Claims
1. A speech signal processing method performed at a terminal device having one or more processors, a microphone, a speaker, and memory storing one or more programs to be executed by the one or more processors, the method comprising:
receiving a to-be-output speech signal transmitted from another terminal device to the terminal device via a network;
obtaining a signal recorded by the microphone, the recorded signal including a noise signal and an echo signal, wherein the noise signal is detected from a near-end environment and the echo signal is detected from the speaker;
before outputting the speech signal via the speaker:
calculating a loop transfer function according to the recorded signal and the speech signal, wherein the loop transfer function indicates a correlation between the recorded signal and the speech signal;
calculating a power spectrum of the echo signal and a power spectrum of the noise signal according to the recorded signal, the speech signal, and the loop transfer function;
calculating a frequency weighted coefficient according to the power spectrum of the echo signal and the power spectrum of the noise signal, wherein the frequency weighted coefficient corresponds to a weakest frequency at which the noise signal has lowest energy;
adjusting a frequency amplitude of the speech signal based on the frequency weighted coefficient by increasing the frequency amplitude of the speech signal at the weakest frequency using the frequency weighted coefficient; and
after adjusting the frequency amplitude of the speech signal:
outputting the adjusted speech signal via the speaker.
Dependent claims: 2, 3, 4, 5, 6, 7
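The weighting and adjustment steps of claim 1 can be sketched as follows. The claim states only that the coefficient is calculated from the echo and noise power spectra and is applied at the frequency where the noise has the lowest energy. The gain rule used here (one plus the square root of the mean noise-to-echo power ratio, capped at `max_gain`) is purely an illustrative assumption, and `boost_weakest_bin` is a hypothetical name.

```python
import math
from typing import List, Tuple


def boost_weakest_bin(
    speech_spec: List[complex],
    noise_power: List[float],
    echo_power: List[float],
    max_gain: float = 2.0,
    eps: float = 1e-12,
) -> Tuple[int, float, List[complex]]:
    """Return (weakest bin index, gain, adjusted speech spectrum)."""
    # Weakest frequency: the bin where the noise has the lowest energy.
    weakest = min(range(len(noise_power)), key=noise_power.__getitem__)
    # Assumed gain rule (not from the patent text): boost more when the
    # environment is noisy relative to the echo, up to max_gain.
    mean_noise = sum(noise_power) / len(noise_power)
    mean_echo = sum(echo_power) / len(echo_power)
    gain = min(max_gain, 1.0 + math.sqrt(mean_noise / (mean_echo + eps)))
    # Increase the amplitude at the weakest bin only; other bins are
    # left unchanged.
    adjusted = list(speech_spec)
    adjusted[weakest] = adjusted[weakest] * gain
    return weakest, gain, adjusted
```

Scaling a complex spectrum value by a real gain raises its amplitude while leaving its phase untouched, which is why the sketch multiplies rather than replaces the bin value.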
8. A terminal device, comprising:
at least one processor;
a microphone;
a speaker;
memory; and
a plurality of program instructions that, when executed by the at least one processor, cause the terminal device to perform the following operations:
receiving a to-be-output speech signal transmitted from another terminal device to the terminal device via a network;
obtaining a signal recorded by the microphone, the recorded signal including a noise signal and an echo signal, wherein the noise signal is detected from a near-end environment and the echo signal is detected from the speaker;
before outputting the speech signal via the speaker:
calculating a loop transfer function according to the recorded signal and the speech signal, wherein the loop transfer function indicates a correlation between the recorded signal and the speech signal;
calculating a power spectrum of the echo signal and a power spectrum of the noise signal according to the recorded signal, the speech signal, and the loop transfer function;
calculating a frequency weighted coefficient according to the power spectrum of the echo signal and the power spectrum of the noise signal, wherein the frequency weighted coefficient corresponds to a weakest frequency at which the noise signal has lowest energy;
adjusting a frequency amplitude of the speech signal based on the frequency weighted coefficient by increasing the frequency amplitude of the speech signal at the weakest frequency using the frequency weighted coefficient; and
after adjusting the frequency amplitude of the speech signal:
outputting the adjusted speech signal via the speaker.
Dependent claims: 9, 10, 11, 12, 13
14. A non-transitory computer readable storage medium in connection with a terminal device having one or more processors, a microphone, and a speaker, the storage medium storing a plurality of program instructions that, when executed by the one or more processors, cause the terminal device to perform the following operations:
receiving a to-be-output speech signal transmitted from another terminal device to the terminal device via a network;
obtaining a signal recorded by the microphone, the recorded signal including a noise signal and an echo signal, wherein the noise signal is detected from a near-end environment and the echo signal is detected from the speaker;
before outputting the speech signal via the speaker:
calculating a loop transfer function according to the recorded signal and the speech signal, wherein the loop transfer function indicates a correlation between the recorded signal and the speech signal;
calculating a power spectrum of the echo signal and a power spectrum of the noise signal according to the recorded signal, the speech signal, and the loop transfer function;
calculating a frequency weighted coefficient according to the power spectrum of the echo signal and the power spectrum of the noise signal, wherein the frequency weighted coefficient corresponds to a weakest frequency at which the noise signal has lowest energy;
adjusting a frequency amplitude of the speech signal based on the frequency weighted coefficient by increasing the frequency amplitude of the speech signal at the weakest frequency using the frequency weighted coefficient; and
after adjusting the frequency amplitude of the speech signal:
outputting the adjusted speech signal via the speaker.
Dependent claims: 15, 16, 17, 18, 19
Specification