Multiple variable threshold speech detector
First Claim
1. Apparatus for indicating the occurrence of speech in a signal indicative of both speech and noise, the apparatus including:
- means for generating (13) a representation of the average magnitude of the signal during a moving time interval;
the apparatus being characterized by classifying means (19) for receiving said representation and a noise level estimate, said classifying means generating a first output to indicate when said representation has prescribed attributes indicative of speech and a second output to indicate when said representation has prescribed attributes indicative of noise;
level estimator means (21) responsive to said first and second outputs and said representation, said level estimator means providing a noise level estimate using the portion of said representation identified by the occurrence of said second output, said level estimator means providing a first decision level output by combining said noise level estimate and the portion of said representation defined by the occurrence of said first output in excess of a prescribed amount of said first decision level output; and
comparing means (16) for providing an output indicative of the occurrence of speech signal activity when said first decision level is exceeded by the signal.
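The generating means (13) of claim 1 can be illustrated with a minimal sketch. It assumes the "moving time interval" is a fixed-length window of the most recent samples and that the "average magnitude" is the mean absolute value over that window. The function name `moving_average_magnitude` and the `window` parameter are hypothetical and do not come from the patent.

```python
from collections import deque


def moving_average_magnitude(samples, window=4):
    """Yield the mean absolute value of the most recent `window` samples.

    Assumption: a plain (unweighted) sliding window; the claim does not
    prescribe a specific averaging rule.
    """
    recent = deque(maxlen=window)  # magnitudes currently inside the window
    total = 0.0                    # running sum of those magnitudes
    for s in samples:
        if len(recent) == window:
            total -= recent[0]     # oldest magnitude is about to drop out
        recent.append(abs(s))
        total += abs(s)
        yield total / len(recent)


# Example: a two-sample window over a short signal.
print(list(moving_average_magnitude([1, -1, 3, -3], window=2)))
# prints [1.0, 1.0, 2.0, 3.0]
```

Keeping a running sum avoids re-adding the whole window for every sample, which matters when the representation is updated once per sample.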
Abstract
A speech detector uses a signal classifier (19) to identify portions of a representation of the average magnitude of a group of signal samples indicative of either speech or noise. A controller (33) in the signal classifier follows a four state sequence using appropriate time constants for signal measures in a variety of signal conditions in defining the speech and noise portions of the representation. A level estimator (21) uses selectively obtained signal measures from the defined portions of the representation to provide adaptively variable decision levels. A speech definer (16) compares the representation to a first decision level and the signal samples to a higher decision level to indicate the occurrence of speech signal activity when either decision level is exceeded. In a two way transmission arrangement, a receive trunk speech detector uses a stretcher (133) to prevent adaptation of the transmit speech detector thresholds when echo signals are present.
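Two pieces of the abstract lend themselves to a short sketch: the speech definer (16), which signals activity when either decision level is exceeded, and the stretcher (133), which blocks threshold adaptation while echo may be present. The sketch below rests on assumptions not stated in the abstract: the sample is compared by magnitude, and the stretcher simply holds the receive-activity flag for a fixed number of frames after it drops. All names (`speech_active`, `stretch`, `hold`) are hypothetical.

```python
def speech_active(representation, sample, first_level, higher_level):
    """Return True when either decision level is exceeded.

    `representation` is the averaged magnitude; `sample` is a raw signal
    sample, compared by magnitude (an assumption of this sketch).
    """
    return representation > first_level or abs(sample) > higher_level


def stretch(flags, hold):
    """Extend every True in `flags` by `hold` additional frames.

    Assumption: the stretcher is modeled as a fixed-length hold, so that
    adaptation stays frozen for a while after receive activity ends.
    """
    remaining = 0
    stretched = []
    for flag in flags:
        if flag:
            remaining = hold          # restart the hold period
            stretched.append(True)
        elif remaining > 0:
            remaining -= 1            # still inside the hold period
            stretched.append(True)
        else:
            stretched.append(False)
    return stretched


# The average is below its level, but a strong sample trips the higher level.
print(speech_active(0.1, 3.0, first_level=0.4, higher_level=2.0))  # prints True

# Receive activity on frame 0 keeps adaptation frozen for two more frames.
print(stretch([True, False, False, False], hold=2))
# prints [True, True, True, False]
```

In a two-way arrangement, the transmit-side level estimator would skip its updates on any frame where the stretched receive flag is True.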
49 Citations
9 Claims
8. An arrangement for detecting speech signal activity in transmission signals indicative of speech and noise, the transmission signal traversing in a first direction, the arrangement comprising:
first means (18) for producing a representation of the transmission signals by weighted averaging the signal occurring over a predetermined recent interval of time;
second means (31), connected to the first means, for producing and maintaining an output indicative of a peak value of the representation;
third means (32), connected to the first means, for producing and maintaining an output indicative of a minimum value of the representation;
controlling means (33, 34, 42, 48), including transition means capable of assuming a prescribed plurality of states occurring in a sequence responsive to predetermined signal conditions, for resetting the second and third means at different intervals according to each one of the prescribed plurality of states;
fourth means (38), in circuit with outputs of the second and third means, for indicating signal activity characteristic of speech when the outputs relate to each other within a first predetermined ratio range;
fifth means (39), in circuit with the outputs of the second and third means, for indicating signal activity characteristic of noise when the outputs relate to each other within a second predetermined ratio range exclusive of the first predetermined ratio range;
noise level estimating means (41-43), connected to the third means and the fifth means, for comparing a stored noise level to the minimum value and altering the stored value prescribed amounts in the direction to achieve equality at intervals defined by an updating signal, the controlling means producing the updating signal after state transitions from active states and while in active states at a predetermined rate;
talker level estimator means (47-49), connected to the second means and the fourth means, for comparing a stored talker level estimate with a sum indicative of the noise level estimate and the current talker level estimate and changing the stored value a prescribed amount upon the occurrence of a second updating signal and when necessary to achieve a more accurate representation of the actual talker level, the controlling means producing the second updating signal while in an idle state and the talker active state at prescribed times; and
speech defining means (16) for receiving the representation of the transmission signals and the sum to provide an activity signal indicative of speech when the representation exceeds the sum.
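The peak and minimum tracking, ratio-based classification, and stepwise noise level update recited in claim 8 can be sketched as follows. This is an illustration under stated assumptions, not the claimed circuit: the ratio thresholds `speech_ratio` and `noise_ratio`, the step size, and all class and function names are hypothetical, and the four-state controller that times the resets and updates is omitted.

```python
import math


class PeakMinTracker:
    """Holds the peak and minimum of the representation since the last reset."""

    def __init__(self):
        self.reset()

    def reset(self):
        # In the claim, the controlling means decides when to reset.
        self.peak = 0.0
        self.minimum = math.inf

    def update(self, value):
        self.peak = max(self.peak, value)
        self.minimum = min(self.minimum, value)


def classify(peak, minimum, speech_ratio=4.0, noise_ratio=2.0):
    """Return 'speech', 'noise', or None from the peak-to-minimum ratio.

    Assumption: a wide ratio suggests speech and a narrow ratio suggests
    noise; the two ranges are mutually exclusive, as in the claim.
    """
    if minimum <= 0 or math.isinf(minimum):
        return None                   # no usable measurement yet
    ratio = peak / minimum
    if ratio >= speech_ratio:
        return "speech"
    if ratio <= noise_ratio:
        return "noise"
    return None


def step_toward(stored, target, step=1.0):
    """Move `stored` one fixed step toward `target` without overshooting."""
    if stored < target:
        return min(stored + step, target)
    if stored > target:
        return max(stored - step, target)
    return stored


# A nearly flat representation is classified as noise, and the stored
# noise level then moves one step toward the tracked minimum.
tracker = PeakMinTracker()
for value in [1.0, 1.2, 1.1]:
    tracker.update(value)
print(classify(tracker.peak, tracker.minimum))    # prints noise
print(step_toward(3.0, tracker.minimum))          # prints 2.0
```

Changing the stored level by a bounded step, instead of copying the minimum directly, keeps a single unusual interval from moving the noise estimate too far.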
Specification