Pitch-based frequency domain voice removal
First Claim
1. A method for modifying an audio signal with a pitch-based signal component removal device, comprising:
- detecting a first most prominent pitch associated with the audio signal including:
  - transforming the audio signal into a time frequency domain to generate short time frequency spectra for the audio signal;
  - obtaining a plurality of pitch candidate frequency domain combs associated with a plurality of pitch candidates;
  - performing a cross-correlation in the frequency domain using information associated with the short time frequency spectra and the plurality of pitch candidate frequency domain combs in generating cross-correlation values; and
  - identifying the pitch candidate associated with a first maximum value among the cross-correlation values as the first most prominent pitch;
- detecting a second most prominent pitch associated with the audio signal by removing cross-correlation values associated with the first most prominent pitch from consideration and identifying the pitch candidate associated with a second maximum value among the remaining cross-correlation values as the second most prominent pitch; and
- in the event the second most prominent pitch is associated with voice, modifying in the audio signal a portion that is associated with the second most prominent pitch.
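The detection steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes a single frame rather than a sequence of short time spectra, a periodic Hann window, combs whose teeth sit on integer multiples of each candidate pitch, a 1/h weight on the h-th tooth, and a relative tolerance for deciding which cross-correlation values count as "associated with" the first pitch. All function names and parameters (`magnitude_spectrum`, `comb_scores`, `two_most_prominent`, `harmonic_tone`, `rel_tolerance`) are hypothetical.

```python
import cmath
import math


def harmonic_tone(freq_hz, gain, n_samples, sample_rate, n_harmonics=4):
    """Test signal only: a sum of harmonics with amplitudes gain / h."""
    return [
        sum(gain / h * math.sin(2 * math.pi * h * freq_hz * t / sample_rate)
            for h in range(1, n_harmonics + 1))
        for t in range(n_samples)
    ]


def magnitude_spectrum(frame):
    """Magnitude of a naive DFT (bins 0..N/2) of one Hann-windowed frame."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * t / n))
                for t, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc))
    return spectrum


def comb_scores(spectrum, candidates_hz, sample_rate, n_harmonics=4):
    """Correlate the spectrum with one comb per pitch candidate.

    Assumption: each comb has teeth at integer multiples of the candidate
    pitch, and the h-th tooth is weighted by 1 / h.
    """
    n_fft = 2 * (len(spectrum) - 1)
    scores = {}
    for pitch in candidates_hz:
        total = 0.0
        for h in range(1, n_harmonics + 1):
            k = round(h * pitch * n_fft / sample_rate)  # bin nearest the tooth
            if k < len(spectrum):
                total += spectrum[k] / h
        scores[pitch] = total
    return scores


def two_most_prominent(scores, rel_tolerance=0.1):
    """Pick the best candidate, drop scores near it, then pick the next best.

    Assumption: a score is "associated with" the first pitch when its
    candidate lies within rel_tolerance (relative) of that pitch.
    """
    first = max(scores, key=scores.get)
    remaining = {p: s for p, s in scores.items()
                 if abs(p - first) > rel_tolerance * first}
    second = max(remaining, key=remaining.get)
    return first, second
```

Calling `two_most_prominent(comb_scores(magnitude_spectrum(frame), candidates, sample_rate))` returns the two candidate pitches in order of prominence. The 1/h tooth weighting is one way to keep sub-octave candidates from outscoring the true pitch; a real system would also need the voice/non-voice decision that the claim leaves to a separate step.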
Abstract
A system and method are disclosed for modifying an audio signal. A pitch associated with the audio signal is detected. A portion of the audio signal that is associated with the detected pitch is modified. Controlling the modification of a primary audio signal is disclosed. The level of a secondary audio signal is monitored. Modification of the primary audio signal is enabled if the level of the secondary audio signal rises above a first prescribed threshold at a time when the primary audio signal is not being modified. Modification of the primary audio signal is disabled if the level of the secondary audio signal drops below a second prescribed threshold at a time when the primary audio signal is being modified.
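The enable/disable rule in the abstract is a two-threshold (hysteresis) gate. A minimal Python sketch follows, assuming the secondary signal level has already been measured as a number per frame; the function name `update_gate` and its parameter names are hypothetical.

```python
def update_gate(modifying, secondary_level, enable_threshold, disable_threshold):
    """Return the new on/off state of the primary-signal modification.

    Enable when the secondary level rises above enable_threshold while
    modification is off; disable when it drops below disable_threshold
    while modification is on; otherwise keep the current state.
    """
    if not modifying and secondary_level > enable_threshold:
        return True
    if modifying and secondary_level < disable_threshold:
        return False
    return modifying


def run_gate(levels, enable_threshold, disable_threshold):
    """Apply update_gate over a sequence of levels, starting disabled."""
    state = False
    states = []
    for level in levels:
        state = update_gate(state, level, enable_threshold, disable_threshold)
        states.append(state)
    return states
```

Choosing a disable threshold lower than the enable threshold keeps the gate from toggling rapidly when the secondary level hovers near a single cutoff.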
61 Claims
56. A system for modifying an audio signal, comprising:
- an input connection configured to receive the audio signal; and
- a processor configured to:
  - detect a first most prominent pitch associated with the audio signal, including by:
    - transforming the audio signal into a time frequency domain to generate short time frequency spectra for the audio signal;
    - obtaining a plurality of pitch candidate frequency domain combs associated with a plurality of pitch candidates;
    - performing a cross-correlation in the frequency domain using information associated with the short time frequency spectra and the plurality of pitch candidate frequency domain combs in generating cross-correlation values; and
    - identifying the pitch candidate associated with a first maximum value among the cross-correlation values as the first most prominent pitch;
  - detect a second most prominent pitch associated with the audio signal by removing cross-correlation values associated with the first most prominent pitch from consideration and identifying the pitch candidate associated with a second maximum value among the remaining cross-correlation values as the second most prominent pitch; and
  - in the event the second most prominent pitch is associated with voice, modify in the audio signal a portion that is associated with the second most prominent pitch.
58. A system for modifying an audio signal, comprising:
- means for detecting a first most prominent pitch associated with the audio signal, including:
  - transforming the audio signal into a time frequency domain to generate short time frequency spectra for the audio signal;
  - obtaining a plurality of pitch candidate frequency domain combs associated with a plurality of pitch candidates;
  - performing a cross-correlation in the frequency domain using information associated with the short time frequency spectra and the plurality of pitch candidate frequency domain combs in generating cross-correlation values; and
  - identifying the pitch candidate associated with a first maximum value among the cross-correlation values as the first most prominent pitch;
- means for detecting a second most prominent pitch associated with the audio signal that includes means for removing cross-correlation values associated with the first most prominent pitch from consideration and identifying the pitch candidate associated with a second maximum value among the remaining cross-correlation values as the second most prominent pitch; and
- means for modifying in the audio signal a portion that is associated with the second most prominent pitch in the event the second most prominent pitch is associated with voice.
-
60. A computer program product for modifying an audio signal, the computer program product being embodied in a non-transitory computer readable medium and comprising computer instructions for:
- detecting a first most prominent pitch associated with the audio signal including:
  - transforming the audio signal into a time frequency domain to generate short time frequency spectra for the audio signal;
  - obtaining a plurality of pitch candidate frequency domain combs associated with a plurality of pitch candidates;
  - performing a cross-correlation in the frequency domain using information associated with the short time frequency spectra and the plurality of pitch candidate frequency domain combs in generating cross-correlation values; and
  - identifying the pitch candidate associated with a first maximum value among the cross-correlation values as the first most prominent pitch;
- detecting a second most prominent pitch associated with the audio signal by removing cross-correlation values associated with the first most prominent pitch from consideration and identifying the pitch candidate associated with a second maximum value among the remaining cross-correlation values as the second most prominent pitch; and
- in the event the second most prominent pitch is associated with voice, modifying in the audio signal a portion that is associated with the second most prominent pitch.
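The claims do not say how the portion associated with the detected pitch is modified. One plausible reading, sketched below in Python purely as an assumption, is to scale the spectral bins that fall on or near the harmonics of that pitch. The function name `attenuate_pitch` and the parameters `gain` and `half_width` are hypothetical.

```python
def attenuate_pitch(spectrum, pitch_hz, sample_rate, n_fft, gain=0.0, half_width=1):
    """Return a copy of one frame's spectrum with harmonic bins scaled by gain.

    spectrum holds bins 0..n_fft/2 (real magnitudes or complex values).
    Assumption: the "portion associated with the pitch" is every bin within
    half_width bins of an integer multiple of pitch_hz.
    """
    if pitch_hz <= 0:
        raise ValueError("pitch_hz must be positive")
    out = list(spectrum)
    h = 1
    while True:
        center = round(h * pitch_hz * n_fft / sample_rate)
        if center >= len(out):
            break  # past the highest bin; no more harmonics to touch
        lo = max(0, center - half_width)
        hi = min(len(out), center + half_width + 1)
        for k in range(lo, hi):
            out[k] *= gain
        h += 1
    return out
```

With `gain=0.0` the harmonic bins are removed outright; a value between 0 and 1 gives partial suppression. An inverse transform of the modified frames would then be needed to return to the time domain.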
Specification