Speech-processing apparatus and speech-processing method
Abstract
A speech-processing apparatus includes: a sound source localization unit that localizes a sound source based on an acquired speech signal; and a speech zone detection unit that performs speech zone detection based on localization information localized by the sound source localization unit.
6 Claims
1. A speech-processing apparatus, comprising:
a processor configured to:
localize a sound source based on an acquired speech signal; and
perform speech zone detection in which a speech start and a speech end are detected based on localization information of the localized sound source,
wherein the processor is configured to perform the speech zone detection by using a plurality of threshold values with respect to the localized speech signal, and
wherein the processor is configured to:
detect a sound source candidate by using a first threshold value of the plurality of threshold values with respect to the localized speech signal,
perform a clustering process on the detected sound source candidate, and
perform the speech zone detection in which a speech start and a speech end are detected, by using a second threshold value that is larger than the first threshold value of the plurality of threshold values for each cluster classified by the clustering process.
Dependent claims: 2, 3, 4, 5
6. A speech-processing method, comprising:
(a) localizing a sound source based on an acquired speech signal;
(b) performing speech zone detection in which a speech start and a speech end are detected based on localization information of the sound source localized in (a); and
(c) performing the speech zone detection by using a plurality of threshold values with respect to the speech signal localized in (a),
wherein in (c), a sound source candidate is detected by using a first threshold value of the plurality of threshold values with respect to the localized speech signal, a clustering process is performed on the detected sound source candidate, and the speech zone detection in which a speech start and a speech end are detected is performed by using a second threshold value that is larger than the first threshold value of the plurality of threshold values for each cluster classified by the clustering process.
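For illustration only, the two-threshold flow recited in step (c) might be sketched in Python as follows. This is not the patented implementation. The `Observation` record, the function names, the greedy clustering rule, and the `max_angle_gap` and `max_frame_gap` parameters are all assumptions introduced here. The claims do not specify a clustering algorithm, and the sketch assumes localization has already produced a direction and a power value per frame.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Observation:
    """One localized result (hypothetical layout): frame index, direction, power."""
    frame: int
    azimuth_deg: float
    power: float


def detect_candidates(obs: List[Observation], first_threshold: float) -> List[Observation]:
    # Step 1: keep every observation whose power reaches the lower threshold.
    return [o for o in obs if o.power >= first_threshold]


def cluster_candidates(cands: List[Observation],
                       max_angle_gap: float = 15.0,
                       max_frame_gap: int = 5) -> List[List[Observation]]:
    # Step 2: greedy clustering (an assumed stand-in for the unspecified
    # clustering process). A candidate joins the first cluster whose most
    # recent member is close in both time and direction.
    # Azimuth wrap-around at +/-180 degrees is ignored for brevity.
    clusters: List[List[Observation]] = []
    for o in sorted(cands, key=lambda c: c.frame):
        for cluster in clusters:
            last = cluster[-1]
            if (o.frame - last.frame <= max_frame_gap
                    and abs(o.azimuth_deg - last.azimuth_deg) <= max_angle_gap):
                cluster.append(o)
                break
        else:
            clusters.append([o])
    return clusters


def detect_zone(cluster: List[Observation],
                second_threshold: float) -> Optional[Tuple[int, int]]:
    # Step 3: within one cluster, the speech start and end are taken as the
    # first and last frames whose power reaches the higher threshold.
    strong = [o.frame for o in cluster if o.power >= second_threshold]
    if not strong:
        return None  # the cluster never exceeded the second threshold
    return (min(strong), max(strong))


def speech_zones(obs: List[Observation],
                 first_threshold: float,
                 second_threshold: float) -> List[Tuple[int, int]]:
    if second_threshold <= first_threshold:
        raise ValueError("second_threshold must be larger than first_threshold")
    zones = []
    for cluster in cluster_candidates(detect_candidates(obs, first_threshold)):
        zone = detect_zone(cluster, second_threshold)
        if zone is not None:
            zones.append(zone)
    return zones


if __name__ == "__main__":
    demo = [Observation(f, 30.0, p) for f, p in enumerate([0.5, 1.5, 3.0, 3.5, 1.2, 0.4])]
    demo += [Observation(f, -60.0, p) for f, p in zip([2, 3, 4], [1.1, 1.3, 1.2])]
    print(speech_zones(demo, first_threshold=1.0, second_threshold=2.5))
```

In this sketch the lower threshold admits weak candidates so that clusters can form, while the higher threshold decides, per cluster, whether a speech zone exists and where it starts and ends. A cluster that never reaches the second threshold yields no zone.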
Specification