Noise segment/speech segment determination apparatus
Abstract
An extraction section extracts a speech signal having ambient noise superimposed thereon as a data segment having a predetermined duration. An autocorrelation function normalizing section determines normalized autocorrelation function vectors. A normalized autocorrelation function count section counts a given number of normalized autocorrelation function vectors. A noise vector region/speech vector region/undefined vector computation section classifies the normalized autocorrelation function vectors into any of a noise vector region, a speech vector region, or undefined vectors. When the latest normalized autocorrelation function vector acquired by a normalized autocorrelation function vector determination section pertains to the noise vector region, the speech signal is determined to be a noise segment. In contrast, when the latest vector does not pertain to the noise vector region, the input signal is determined to be a speech segment.
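The normalization step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the pure-Python summation, and the choice to skip all-zero frames are assumptions made here for clarity.

```python
def autocorrelation(frame, p):
    """Return [R(0), R(1), ..., R(p)] for one extracted frame of samples."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + k] for i in range(n - k))
        for k in range(p + 1)
    ]


def normalized_vector(frame, p):
    """Return (r(1), ..., r(p)) with r(k) = R(k) / R(0).

    Assumption: an all-zero frame has R(0) == 0 and is skipped (None).
    """
    R = autocorrelation(frame, p)
    if R[0] == 0:
        return None
    return [R[k] / R[0] for k in range(1, p + 1)]
```

Dividing by R(0) removes the frame energy, so the resulting vector depends on the spectral shape of the segment rather than on its level.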
12 Claims
1. A noise segment/speech segment determination apparatus which determines whether an input signal segment is a noise segment or a speech segment, the apparatus comprising:
an analog-to-digital conversion unit for converting a speech signal having ambient noise superimposed thereon into a digital signal;
a data extraction unit for extracting the digital signal as segment data having a predetermined duration;
an autocorrelation function computation unit for computing an autocorrelation function of the extracted data, provided that an analysis order is taken up to a “p-order,” R(0), R(1), R(2), . . . R(p);
an autocorrelation function normalizing unit for obtaining a normalized autocorrelation function by means of dividing the autocorrelation function by R(0);
a normalized autocorrelation function count unit for counting the number of times normalized autocorrelation functions have arisen;
a normalized autocorrelation function storage unit for storing the normalized autocorrelation functions as normalized autocorrelation function vectors (r(1), r(2), . . . r(p));
a noise vector region/speech vector region/undefined vector computation unit which classifies and computes a plurality of normalized autocorrelation function vectors into one or a plurality of noise vector regions, one or a plurality of speech vector regions, and undefined vectors, when the number of normalized autocorrelation function vectors stored in the normalized autocorrelation function storage unit has reached a predetermined number;
a noise vector region/speech vector region/undefined vector storage unit for storing the noise vector region, the speech vector region, and undefined vectors; and
a normalized autocorrelation function vector determination unit which determines to which, if any, of a plurality of noise vector regions the latest normalized autocorrelation function vector stored in the normalized autocorrelation function storage unit pertains, and which determines the acquired signal segment as corresponding to a noise section when the vector pertains to one of the plurality of noise vector regions and determines the acquired signal segment as corresponding to a speech section when the vector does not pertain to any of the plurality of noise vector regions.
Dependent claims: 2, 5, 6, 7
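The determination step recited at the end of claim 1 can be sketched as below. The claim does not state how a region is represented, so this sketch assumes, purely for illustration, that each noise vector region is a ball given by a hypothetical (center, radius) pair in the p-dimensional space of vectors (r(1), ..., r(p)).

```python
import math


def in_region(vec, region):
    """True if vec lies inside the region (assumed form: center and radius)."""
    center, radius = region
    return math.dist(vec, center) <= radius


def determine_segment(latest_vec, noise_regions):
    """Return 'noise' if the latest vector falls in any noise region, else 'speech'."""
    if any(in_region(latest_vec, region) for region in noise_regions):
        return "noise"
    return "speech"
```

Note that, as in the claim, a vector that falls in a speech region or among the undefined vectors is treated the same way: anything outside every noise region is reported as speech.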
3. A noise segment/speech segment determination apparatus which determines whether an input signal segment is a noise segment or a speech segment, the apparatus comprising:
an analog-to-digital conversion unit for converting a speech signal having ambient noise superimposed thereon into a digital signal;
a data extraction unit for extracting the digital signal as segment data having a predetermined duration;
an autocorrelation function computation unit for computing an autocorrelation function of the extracted data, provided that an analysis order is taken up to a “p-order,” R(0), R(1), R(2), . . . R(p);
an autocorrelation function normalizing unit for obtaining a normalized autocorrelation function by means of dividing the autocorrelation function by R(0);
a normalized autocorrelation function vector address computation unit for performing computation to determine to which one of p-order normalized autocorrelation function vector spaces that have been assigned the normalized autocorrelation function vectors beforehand and divided beforehand the normalized autocorrelation function vector pertains;
a normalized autocorrelation function count unit for counting the number of times normalized autocorrelation functions have arisen;
a normalized autocorrelation function vector/region storage unit which stores the normalized autocorrelation functions and their addresses as normalized autocorrelation function vectors (r(1), r(2), . . . r(p)); and
a normalized autocorrelation function vector region computation/determination unit which, when the number of normalized autocorrelation function vectors stored in the normalized autocorrelation function vector/region storage unit has reached a predetermined number, classifies a plurality of normalized autocorrelation function vectors into at least one noise vector region, at least one speech vector region, and undefined vectors and stores a result of classification into the normalized autocorrelation function vector/region storage unit;
determines to which, if any, of a plurality of noise vector regions the latest normalized autocorrelation function vector stored in the normalized autocorrelation function storage unit pertains;
determines the acquired signal segment as corresponding to a noise section when the vector pertains to one of the plurality of noise vector regions; and
determines the acquired signal segment as corresponding to a speech section when the vector does not pertain to any of the plurality of noise vector regions.
Dependent claims: 4, 8
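The address computation of claim 3 can be illustrated with a simple grid over the vector space. Everything specific here is an assumption made for the sketch: the uniform division of each axis of [-1, 1] into `bins` cells, the occupancy thresholds `noise_min` and `speech_min`, and the rule that heavily occupied cells are labeled noise. The claim itself only requires that the space be divided and addressed beforehand.

```python
from collections import Counter


def address(vec, bins=8):
    """Map each r(k) in [-1, 1] to a cell index 0..bins-1; the tuple is the address."""
    return tuple(
        max(0, min(int((x + 1.0) / 2.0 * bins), bins - 1)) for x in vec
    )


def classify_cells(vectors, bins=8, noise_min=3, speech_min=2):
    """Label each occupied cell once enough vectors are stored (hypothetical rule)."""
    counts = Counter(address(v, bins) for v in vectors)
    labels = {}
    for cell, n in counts.items():
        if n >= noise_min:
            labels[cell] = "noise"
        elif n >= speech_min:
            labels[cell] = "speech"
        else:
            labels[cell] = "undefined"
    return labels


def determine(vec, labels, bins=8):
    """Noise only if the vector's cell is labeled noise; otherwise speech."""
    return "noise" if labels.get(address(vec, bins)) == "noise" else "speech"
```

Storing an address alongside each vector turns the membership test into a dictionary lookup instead of a distance computation against every region.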
9. A noise segment/speech segment determination apparatus which determines whether an input signal segment is a noise segment or a speech segment, the apparatus comprising:
an analog-to-digital conversion unit for converting into a digital signal a speech signal having ambient noise superimposed thereon;
a data extraction unit for extracting the digital signal as segment data having a predetermined duration;
an autocorrelation function computation unit for computing an autocorrelation function of the extracted data, provided that an analysis order is taken up to a “p-order,” R(0), R(1), R(2), . . . R(p);
a data storage unit for storing the digital signal extracted by the data extraction unit;
a pitch autocorrelation function computation unit for computing a pitch autocorrelation function through use of a digital signal extracted by the data extraction unit and the data stored in the data storage unit;
a pitch autocorrelation function maximum value selection/normalization unit which selects the maximum pitch autocorrelation function and normalizes the maximum pitch autocorrelation function;
a noise segment/speech segment determination unit for determining whether an acquired signal segment is a speech segment or a noise segment, through use of the maximum normalized pitch autocorrelation function;
an autocorrelation function normalizing unit for obtaining a normalized autocorrelation function by means of dividing the autocorrelation function by R(0) when the noise segment/speech segment determination unit has rendered the signal segment a noise segment;
a normalized autocorrelation function count unit for counting the number of times normalized autocorrelation functions have arisen;
a normalized autocorrelation function storage unit for storing the normalized autocorrelation function as a normalized autocorrelation function vector (r(1), r(2), . . . r(p));
a noise vector region/speech vector region/undefined vector computation section which, when the number of normalized autocorrelation function vectors stored in the normalized autocorrelation function storage unit has reached a predetermined number, computes one or a plurality of noise vector regions, one or a plurality of speech vector regions, and one or a plurality of undefined vectors;
a noise vector region/speech vector region/undefined vector storage section which stores the noise vector region, the speech vector region, and an undefined vector;
a normalized autocorrelation function vector determination unit which determines whether the latest normalized autocorrelation function vector stored in the normalized autocorrelation function storage unit pertains to the noise vector region, or to which, if any, of a plurality of noise vector regions the latest normalized autocorrelation function vector pertains;
determines the signal segment to be a noise segment when the vector pertains to the noise vector region or to one of the noise vector regions, and determines the signal segment to be a speech segment when the vector does not pertain to the noise vector region; and
a logical OR unit for producing a logical OR product from an output indicating that the normalized autocorrelation function vector determination unit has determined the signal segment to be a speech segment and from an output indicating that the noise segment/speech segment determination unit has determined the signal segment to be a speech segment, wherein the input signal segment is determined to be a noise segment or a speech segment, through use of a speech segment determination output from the logical OR unit and a noise segment determination output from the normalized autocorrelation function vector determination unit.
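The pitch-based first stage and the logical OR combination of claim 9 might look like the sketch below. It is only an illustration under stated assumptions: the lag search range (`min_lag`, `max_lag`), normalization by the zero-lag energy, and the hypothetical `threshold` of 0.5 are not taken from the claim.

```python
def max_normalized_pitch_autocorr(stored, current, min_lag, max_lag):
    """Largest autocorrelation over the pitch lag range, divided by the energy."""
    x = list(stored) + list(current)  # stored data plus the newly extracted frame
    energy = sum(s * s for s in x)
    if energy == 0:
        return 0.0
    best = max(
        sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        for lag in range(min_lag, max_lag + 1)
    )
    return best / energy


def final_decision(pitch_is_speech, vector_decision):
    """OR the two speech outputs.

    vector_decision is 'noise', 'speech', or None when the second stage
    did not run because the first stage already reported speech.
    """
    speech = pitch_is_speech or vector_decision == "speech"
    return "speech" if speech else "noise"


# Usage with a hypothetical threshold of 0.5 on the normalized peak.
peak = max_normalized_pitch_autocorr([1, 0, -1, 0] * 2, [1, 0, -1, 0] * 2, 2, 6)
pitch_is_speech = peak > 0.5
```

Because the normalized autocorrelation vector is computed only for segments the first stage marks as noise, the OR lets either stage promote a segment to speech while a noise result always comes from the vector determination.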
10. A noise segment/speech segment determination apparatus which determines whether an input signal segment is a noise segment or a speech segment, the apparatus comprising:
an analog-to-digital conversion unit for converting into a digital signal a speech signal having ambient noise superimposed thereon;
a data extraction unit for extracting the digital signal as segment data having a predetermined duration;
an autocorrelation function computation unit for computing an autocorrelation function of the extracted data, provided that an analysis order is taken up to a “p-order,” R(0), R(1), R(2), . . . R(p);
a data storage unit for storing the digital signal extracted by the data extraction unit;
a pitch autocorrelation function computation unit for computing a pitch autocorrelation function through use of a digital signal extracted by the data extraction unit and the data stored in the data storage unit;
a pitch autocorrelation function maximum value selection/normalization unit which selects the maximum pitch autocorrelation function and normalizes the maximum pitch autocorrelation function;
a first-order partial autocorrelation function computation unit for computing a first-order autocorrelation function k1 determined as a ratio of autocorrelation function R(1) to autocorrelation function R(0) computed by the autocorrelation function computation unit;
a noise segment/speech segment determination unit for determining whether an acquired signal segment is a speech segment or a noise segment, through use of the maximum normalized pitch autocorrelation function and a value of the first-order partial autocorrelation function (k1);
an autocorrelation function normalizing unit for obtaining a normalized autocorrelation function by means of dividing the autocorrelation function by R(0) when the noise segment/speech segment determination unit has rendered the signal segment a noise segment;
a normalized autocorrelation function count unit for counting the number of times normalized autocorrelation functions have arisen;
a normalized autocorrelation function storage unit for storing the normalized autocorrelation function as a normalized autocorrelation function vector (r(1), r(2), . . . r(p));
a noise vector region/speech vector region/undefined vector computation section which, when the number of normalized autocorrelation function vectors stored in the normalized autocorrelation function storage unit has reached a predetermined number, computes one or a plurality of noise vector regions, one or a plurality of speech vector regions, and one or a plurality of undefined vectors;
a noise vector region/speech vector region/undefined vector storage section which stores the noise vector region, the speech vector region, and an undefined vector;
a normalized autocorrelation function vector determination unit which determines whether the latest normalized autocorrelation function vector stored in the normalized autocorrelation function storage unit pertains to the noise vector region, or to which, if any, of a plurality of noise vector regions the latest normalized autocorrelation function vector pertains;
determines the signal segment to be a noise segment when the vector pertains to the noise vector region or to one of the noise vector regions, and determines the signal segment to be a speech segment when the vector does not pertain to the noise vector region; and
a logical OR unit for producing a logical OR product from an output indicating that the normalized autocorrelation function vector determination unit has determined the signal segment to be a speech segment and from an output indicating that the noise segment/speech segment determination unit has determined the signal segment to be a speech segment, wherein the input signal segment is determined to be a noise segment or a speech segment, through use of a speech segment determination output from the logical OR unit and a noise segment determination output from the normalized autocorrelation function vector determination unit.
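Claim 10 adds the first-order partial autocorrelation function k1 = R(1)/R(0) to the first-stage decision. The ratio itself comes from the claim; the way the two features are combined below (both must exceed a threshold) and the threshold values are hypothetical choices made only to give a runnable sketch.

```python
def first_order_parcor(R):
    """k1 as the ratio R(1) / R(0); returns 0.0 for a silent frame (assumption)."""
    return R[1] / R[0] if R[0] != 0 else 0.0


def first_stage_is_speech(pitch_peak, k1, pitch_threshold=0.5, k1_threshold=0.3):
    """Hypothetical rule: speech needs a strong pitch peak and a high k1."""
    return pitch_peak > pitch_threshold and k1 > k1_threshold
```

A k1 near 1 indicates energy concentrated at low frequencies, which is typical of voiced speech, so it can help reject periodic but high-frequency noise that would pass a pitch test alone.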
11. A noise segment/speech segment determination apparatus which determines whether an input signal segment is a noise segment or a speech segment, the apparatus comprising:
an analog-to-digital conversion unit for converting into a digital signal a speech signal having ambient noise superimposed thereon;
a data extraction unit for extracting the digital signal as segment data having a predetermined duration;
an autocorrelation function computation unit for computing an autocorrelation function of the extracted data, provided that an analysis order is taken up to a “p-order,” R(0), R(1), R(2), . . . R(p);
a data storage unit for storing the digital signal extracted by the data extraction unit;
a pitch autocorrelation function computation unit for computing a pitch autocorrelation function through use of a digital signal extracted by the data extraction unit and the data stored in the data storage unit;
a pitch autocorrelation function maximum value selection/normalization unit which selects the maximum pitch autocorrelation function and normalizes the maximum pitch autocorrelation function;
a noise segment/speech segment determination unit for determining whether an acquired signal segment is a speech segment or a noise segment, through use of the maximum normalized pitch autocorrelation function;
an autocorrelation function normalizing unit for obtaining a normalized autocorrelation function by means of dividing the autocorrelation function by R(0) when the noise segment/speech segment determination unit has rendered the signal segment a noise segment;
a normalized autocorrelation function vector address computation unit for performing computation to determine to which one of p-order normalized autocorrelation function vector spaces that have been assigned the normalized autocorrelation function vectors beforehand and divided beforehand the normalized autocorrelation vector pertains;
a normalized autocorrelation function count unit for counting the number of times normalized autocorrelation functions have arisen;
a normalized autocorrelation function storage unit for storing the normalized autocorrelation functions and their addresses as a normalized autocorrelation function vector (r(1), r(2), . . . r(p));
a normalized autocorrelation function vector region computation/determination unit which, when the number of normalized autocorrelation function vectors stored in the normalized autocorrelation function vector/region storage unit has reached a predetermined number, classifies a plurality of normalized autocorrelation function vectors into one or a plurality of noise vector regions, one or a plurality of speech vector regions, and undefined vectors and stores a result of classification into the normalized autocorrelation function vector/region storage unit;
determines to which, if any, of a plurality of noise vector regions the latest normalized autocorrelation function vector stored in the normalized autocorrelation function storage unit pertains;
determines the acquired signal segment as corresponding to a noise section when the vector pertains to one of the plurality of noise vector regions; and
determines the acquired signal segment as corresponding to a speech section when the vector does not pertain to any of the plurality of noise vector regions; and
a logical OR unit for producing a logical OR product from an output indicating that the normalized autocorrelation function vector region computation/determination unit has determined the signal segment to be a speech segment and from an output indicating that the noise segment/speech segment determination unit has determined the signal segment to be a speech segment, wherein the input signal segment is determined to be a noise segment or a speech segment, through use of a speech segment determination output from the logical OR unit and a noise segment determination output from the normalized autocorrelation function vector region computation/determination unit.
12. A noise segment/speech segment determination apparatus which determines whether an input signal segment is a noise segment or a speech segment, the apparatus comprising:
an analog-to-digital conversion unit for converting into a digital signal a speech signal having ambient noise superimposed thereon;
a data extraction unit for extracting the digital signal as segment data having a predetermined duration;
an autocorrelation function computation unit for computing an autocorrelation function of the extracted data, provided that an analysis order is taken up to a “p-order,” R(0), R(1), R(2), . . . R(p);
a data storage unit for storing the digital signal extracted by the data extraction unit;
a pitch autocorrelation function computation unit for computing a pitch autocorrelation function through use of a digital signal extracted by the data extraction unit and the data stored in the data storage unit;
a pitch autocorrelation function maximum value selection/normalization unit which selects the maximum pitch autocorrelation function and normalizes the maximum pitch autocorrelation function;
a first-order partial autocorrelation function computation unit for computing a first-order autocorrelation function k1 determined as a ratio of autocorrelation function R(1) to autocorrelation function R(0) computed by the autocorrelation function computation unit;
a noise segment/speech segment determination unit for determining whether an acquired signal segment is a speech segment or a noise segment, through use of the maximum normalized pitch autocorrelation function and a value of the first-order partial autocorrelation function (k1);
an autocorrelation function normalizing unit for obtaining a normalized autocorrelation function by means of dividing the autocorrelation function by R(0) when the noise segment/speech segment determination unit has rendered the signal segment a noise segment;
a normalized autocorrelation function vector address computation unit for performing computation to determine to which one of p-order normalized autocorrelation function vector spaces that have been assigned the normalized autocorrelation function vectors beforehand and divided beforehand the normalized autocorrelation vector pertains;
a normalized autocorrelation function count unit for counting the number of times normalized autocorrelation functions have arisen;
a normalized autocorrelation function vector/region storage unit for storing the normalized autocorrelation function as normalized autocorrelation function vectors (r(1), r(2), . . . r(p)) along with their addresses;
a normalized autocorrelation function vector region computation/determination unit which, when the number of normalized autocorrelation function vectors stored in the normalized autocorrelation function vector/region storage unit has reached a predetermined number, classifies a plurality of normalized autocorrelation function vectors into one or a plurality of noise vector regions, one or a plurality of speech vector regions, and undefined vectors and stores a result of classification into the normalized autocorrelation function vector/region storage unit;
determines to which, if any, of a plurality of noise vector regions the latest normalized autocorrelation function vector stored in the normalized autocorrelation function storage unit pertains;
determines the acquired signal segment as corresponding to a noise section when the vector pertains to one of the plurality of noise vector regions; and
determines the acquired signal segment as corresponding to a speech section when the vector does not pertain to any of the plurality of noise vector regions; and
a logical OR unit for producing a logical OR product from an output indicating that the normalized autocorrelation function vector region computation/determination unit has determined the signal segment to be a speech segment and from an output indicating that the noise segment/speech segment determination unit has determined the signal segment to be a speech segment, wherein the input signal segment is determined to be a noise segment or a speech segment, through use of a speech segment determination output from the logical OR unit and a noise segment determination output from the normalized autocorrelation function vector region computation/determination unit.
Specification