Speech-processing apparatus and speech-processing method

US 10,002,623 B2
Filed: 07/29/2016
Issued: 06/19/2018
Est. Priority Date: 09/28/2015
Status: Active Grant

Abstract

A speech-processing apparatus includes: a sound source localization unit that localizes a sound source based on an acquired speech signal; and a speech zone detection unit that performs speech zone detection based on localization information localized by the sound source localization unit.

6 Claims

1. A speech-processing apparatus, comprising:
- a processor configured to:
  - localize a sound source based on an acquired speech signal; and
  - perform speech zone detection in which a speech start and a speech end are detected based on localization information of the localized sound source,
- wherein the processor is configured to perform the speech zone detection by using a plurality of threshold values with respect to the localized speech signal, and
- wherein the processor is configured to:
  - detect a sound source candidate by using a first threshold value of the plurality of threshold values with respect to the localized speech signal,
  - perform a clustering process on the detected sound source candidate, and
  - perform the speech zone detection in which a speech start and a speech end are detected, by using a second threshold value that is larger than the first threshold value of the plurality of threshold values for each cluster classified by the clustering process.

2. The speech-processing apparatus according to claim 1, wherein the processor is configured to:
- detect a sound source candidate by using a second threshold value of the plurality of threshold values with respect to the localized speech signal,
- perform a clustering process on the detected sound source candidate, and
- perform the speech zone detection in which a speech start and a speech end are detected, by using a first threshold value that is smaller than the second threshold value of the plurality of threshold values for each cluster classified by the clustering process.

3. The speech-processing apparatus according to claim 1, wherein the processor is configured to perform the speech zone detection in which a speech start and a speech end are detected, based on a gradient of a spatial spectrum of the localized speech signal.

4. The speech-processing apparatus according to claim 1, wherein the processor is configured to:
- perform sound source separation based on the acquired speech signal;
- perform sound source identification based on the separated separation signal; and
- detect, when the identified result is speech, that speech is continued in a zone.

5. The speech-processing apparatus according to claim 1, wherein the processor is configured to:
- detect a sound source candidate by using a threshold value with respect to the localized speech signal,
- acquire event information indicating that an event which causes noise with respect to the speech signal is occurring,
- generate a mask for a sound source candidate detected by using the threshold value based on the acquired event information, and
- perform the speech zone detection in which a speech start and a speech end are detected, by using the mask generated for the sound source candidate.
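
The masking step of claim 5 can be illustrated with a minimal Python sketch. The claim does not say how event information is represented or how the mask is applied, so everything concrete here is an assumption: noise events are given as hypothetical `(start_frame, end_frame)` intervals, the localized signal is reduced to one power value per frame, and the function names `mask_candidates` and `zones_from_flags` are invented for illustration.

```python
def mask_candidates(powers, threshold, noise_events):
    """Return one boolean per frame: above threshold and not masked.

    powers       -- per-frame localization power (assumed representation)
    threshold    -- candidate-detection threshold
    noise_events -- list of (start_frame, end_frame) intervals, inclusive
                    (hypothetical form of the "event information")
    """
    # Candidate detection with a single threshold on the localized signal.
    candidates = [p > threshold for p in powers]
    # Mask is False for every frame that falls inside a reported noise event.
    mask = [not any(start <= t <= end for start, end in noise_events)
            for t in range(len(powers))]
    # A frame survives only if it is a candidate and is not masked out.
    return [c and m for c, m in zip(candidates, mask)]


def zones_from_flags(flags):
    """Collapse runs of True into (speech_start, speech_end) frame pairs."""
    zones, start = [], None
    for t, active in enumerate(flags):
        if active and start is None:
            start = t                      # speech start detected
        elif not active and start is not None:
            zones.append((start, t - 1))   # speech end detected
            start = None
    if start is not None:
        zones.append((start, len(flags) - 1))
    return zones


powers = [0.1, 0.7, 0.8, 0.9, 0.2, 0.7, 0.8]
flags = mask_candidates(powers, threshold=0.5, noise_events=[(2, 3)])
print(zones_from_flags(flags))
```

In this sketch, frames 2 and 3 exceed the threshold but coincide with a noise event, so they are excluded before the start and end of each zone are read off the remaining frames.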

6. A speech-processing method, comprising:
- (a) localizing a sound source based on an acquired speech signal;
- (b) performing speech zone detection in which a speech start and a speech end are detected based on localization information of the sound source localized in (a); and
- (c) performing the speech zone detection by using a plurality of threshold values with respect to the speech signal localized in (a), wherein in (c),
  - a sound source candidate is detected by using a first threshold value of the plurality of threshold values with respect to the localized speech signal,
  - a clustering process is performed on the detected sound source candidate, and
  - the speech zone detection in which a speech start and a speech end are detected is performed by using a second threshold value that is larger than the first threshold value of the plurality of threshold values for each cluster classified by the clustering process.
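
The three sub-steps of (c) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the claims do not specify the form of the localized signal or the clustering rule, so the sketch assumes a spatial-spectrum array `spectrum[t][d]` (power at frame `t`, direction index `d`) and a simple greedy clustering by time and direction proximity. The names `detect_speech_zones`, `max_dir_gap`, and `max_frame_gap` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    # Each member is (frame_index, direction_deg, power).
    members: list = field(default_factory=list)


def detect_speech_zones(spectrum, directions, th_low, th_high,
                        max_dir_gap=10.0, max_frame_gap=2):
    """Two-threshold speech zone detection over clustered candidates."""
    if not th_high > th_low:
        raise ValueError("second threshold must be larger than the first")

    # Step 1: sound source candidates via the first (lower) threshold.
    candidates = [(t, directions[d], p)
                  for t, row in enumerate(spectrum)
                  for d, p in enumerate(row)
                  if p > th_low]

    # Step 2: greedy clustering (assumed rule): join a candidate to the first
    # cluster whose latest member is close in both time and direction.
    # Angle wrap-around at 360 degrees is not handled in this sketch.
    clusters = []
    for t, direction, p in candidates:
        for c in clusters:
            last_t, last_dir, _ = c.members[-1]
            if (t - last_t <= max_frame_gap
                    and abs(direction - last_dir) <= max_dir_gap):
                c.members.append((t, direction, p))
                break
        else:
            clusters.append(Cluster([(t, direction, p)]))

    # Step 3: per cluster, speech start/end via the second (higher) threshold.
    zones = []
    for c in clusters:
        strong = [t for t, _, p in c.members if p > th_high]
        if strong:
            zones.append((min(strong), max(strong)))
    return zones


spectrum = [
    [0.1, 0.1],
    [0.4, 0.1],
    [0.9, 0.1],
    [0.8, 0.1],
    [0.4, 0.1],
    [0.1, 0.35],
]
print(detect_speech_zones(spectrum, [0.0, 90.0], th_low=0.3, th_high=0.6))
```

In this toy input, the weak source at 90 degrees passes the first threshold and forms its own cluster, but it never exceeds the second threshold, so no speech zone is reported for it.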

Current Assignee: Honda Motor Co., Ltd. (Honda Motor Company)
Original Assignee: Honda Motor Co., Ltd. (Honda Motor Company)
Inventors: Nakamura, Keisuke; Nakadai, Kazuhiro
Primary Examiner(s): Pham, Thierry L
Application Number: US15/223,478
Publication Number: US 20170092298A1
Time in Patent Office: 690 Days
Field of Search: 704226, 704233, 704235, 704246
US Class Current
CPC Class Codes
- G01S 3/80: using ultrasonic, sonic or ...
- G10K 11/1754: Speech masking
- G10L 2025/783: based on threshold decision
- G10L 21/0272: Voice signal separating
- G10L 25/18: the extracted parameters be...
- G10L 25/84: for discriminating voice fr...
- G10L 25/93: Discriminating between voic...
