Method, device and program for extracting and recognizing voice

US 7,440,892 B2
Filed: 03/08/2005
Issued: 10/21/2008
Est. Priority Date: 03/11/2004
Status: Expired due to Fees

Abstract

In a method of extracting voice components free of noise components from voice signals input through a single microphone, a signal-decomposing unit extracts independent signal components from the voice signals by using a plurality of filters, each of which passes signal components of a different frequency band. A signal-synthesizing unit synthesizes the signal components according to a first rule to form a first synthesized signal, and synthesizes the signal components according to a second rule to form a second synthesized signal. The first and second rules are determined so that the difference between the probability density function of the first synthesized signal and the probability density function of the second synthesized signal becomes a maximum. An output selection unit selectively outputs, of the two synthesized signals, the one having the larger difference from the Gaussian distribution.
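The output selection step in the abstract can be illustrated with a minimal sketch. It assumes that "difference from the Gaussian distribution" is measured by the absolute excess kurtosis of each signal; that measure is an assumption made here for illustration and is not stated in the text above. The function names `excess_kurtosis` and `select_output` and the synthetic test signals are likewise hypothetical.

```python
import random

def excess_kurtosis(x):
    """Sample excess kurtosis: close to 0 for Gaussian data, positive for peaky data."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    m2 = sum(c * c for c in centered) / n   # second central moment
    m4 = sum(c ** 4 for c in centered) / n  # fourth central moment
    return m4 / (m2 * m2) - 3.0

def select_output(y1, y2):
    """Return whichever signal is farther from Gaussian (assumed measure: |excess kurtosis|)."""
    return y1 if abs(excess_kurtosis(y1)) >= abs(excess_kurtosis(y2)) else y2

rng = random.Random(0)
n = 20000
# Gaussian samples stand in for a noise-dominated synthesized signal.
noise_like = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Laplace samples stand in for a voice-dominated synthesized signal
# (an illustrative modeling assumption, not taken from the patent text).
voice_like = [rng.expovariate(1.0) * (1.0 if rng.random() < 0.5 else -1.0)
              for _ in range(n)]

chosen = select_output(noise_like, voice_like)
```

Under these assumptions `chosen` is the Laplace-like signal, because its excess kurtosis is well above that of the Gaussian-like signal.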

14 Claims

1. A method of extracting voice components from a digital voice signal containing a mixture of voice components and noise components, said method comprising:
- extracting a plurality of kinds of signal components from the digital voice signal containing the mixture of voice components and noise components by using a plurality of digital band-pass filters;
  
  forming a first synthesized signal by synthesizing, according to a first rule having a first probability density function, the extracted signal components, and forming a second synthesized signal by synthesizing, according to a second rule having a second probability density function different from the first rule, the extracted signal components, a respective difference between the first and second probability density functions and a Gaussian distribution being at a maximum; and
  
  selectively producing a synthesized voice signal expressing the feature of the voice components from the first and second synthesized signals;
  
  wherein the first and second rules are determined based on characteristic feature quantities of the first and second synthesized signals, wherein differences between the first synthesized signal and the second synthesized signal and the Gaussian distribution are evaluated to selectively output the one of the first and second synthesized signals having the greatest difference from the Gaussian distribution as the synthesized voice signal.
  - 2. The method of claim 1, wherein the extracting of the plurality of kinds of signal components further comprises setting impulse responses of the plurality of filters so that the signal components extracted by the filters become one of independent from and uncorrelated with each other.
  - 3. The method of claim 1, wherein the digital band-pass filters include one of a FIR type filter and an IIR type filter.
  - 4. The method of claim 1, wherein the first and second rules are so determined that a statistic feature quantity representing the difference between the first and second probability density functions of the first and second synthesized signals becomes maximum.
  - 5. The method of claim 1, wherein the first and second rules include weighing and adding the signal components and the second synthesized signal.
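The extraction and synthesis steps of claims 1, 3 and 5 can be sketched as a small FIR filter bank followed by two weighted sums. This is only an illustration under stated assumptions: the windowed-sinc design with a Hamming window, the band edges, the tap count, the test tones and both weight sets are hypothetical choices, and the claims do not prescribe any of them.

```python
import math

def fir_bandpass(low_hz, high_hz, fs, num_taps=101):
    """Windowed-sinc FIR band-pass taps (Hamming window); num_taps should be odd."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            h = 2.0 * (high_hz - low_hz) / fs
        else:
            h = (math.sin(2 * math.pi * high_hz * k / fs)
                 - math.sin(2 * math.pi * low_hz * k / fs)) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(h * window)
    return taps

def fir_filter(taps, x):
    """Causal FIR convolution; the output has the same length as x."""
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j in range(min(i + 1, len(taps))):
            acc += taps[j] * x[i - j]
        out.append(acc)
    return out

def weigh_and_add(components, weights):
    """Weighted sum of the band components, one weight per band."""
    length = len(components[0])
    return [sum(w * c[i] for w, c in zip(weights, components)) for i in range(length)]

fs = 8000  # hypothetical sampling rate in Hz
# Two tones stand in for a digital voice signal with mixed content.
x = [math.sin(2 * math.pi * 300 * n / fs) + math.sin(2 * math.pi * 2000 * n / fs)
     for n in range(600)]

bands = [(100, 1000), (1000, 3000)]  # hypothetical band edges in Hz
components = [fir_filter(fir_bandpass(lo, hi, fs), x) for lo, hi in bands]

# Two different hypothetical rules (weight sets) give two synthesized signals.
first = weigh_and_add(components, (0.9, 0.1))
second = weigh_and_add(components, (0.1, 0.9))
```

In this sketch each rule is just a tuple of per-band weights, so changing a rule only changes the weights passed to `weigh_and_add`; the filter bank itself is shared by both synthesized signals.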

6. An apparatus for selectively extracting voice components from a digital voice signal containing a mixture of voice components and noise components, the apparatus comprising:
- a plurality of digital band-pass filters;
  
  extract means for extracting a plurality of kinds of signal components from the mixture of voice components and noise components of the digital voice signal input from an external unit by using the plurality of digital band-pass filters;
  
  first synthesizing means for forming a first synthesized signal by synthesizing the signal components extracted by the extract means according to a first rule having a first probability density function;
  
  second synthesizing means for forming a second synthesized signal by synthesizing the signal components extracted by the extract means according to a second rule having a second probability density function different from the first rule, a difference between the first and second probability functions and the Gaussian distribution being at a maximum;
  
  selective output means for selectively producing a synthesized voice signal expressing the feature of the voice component from the first synthesized signal formed by the first synthesizing means and the second synthesized signal formed by the second synthesizing means; and
  
  determining means for determining the first and second rules based on a statistic feature quantity of the first synthesized signal formed by the first synthesizing means and of the second synthesized signal formed by the second synthesizing means, wherein differences between the first synthesized signal and the second synthesized signal and the Gaussian distribution are evaluated to selectively output one of the first synthesized signal and the second synthesized signal having the greatest difference from the Gaussian distribution as the synthesized voice signal.
  - 7. The apparatus for selectively extracting according to claim 6, wherein the extract means sets the impulse responses of the plurality of digital band-pass filters such that the signal components extracted by the filters become one of independent from and uncorrelated with each other, and extracts the plurality of kinds of signal components from the digital voice signals by using the plurality of digital band-pass filters.
  - 8. The apparatus for selectively extracting according to claim 6, wherein the digital band-pass filters are one of an FIR type or an IIR type.
  - 9. The apparatus for selectively extracting according to claim 6, wherein the first and second rules are so determined that a quantity expressing the difference between the first and second probability density functions of the first and second synthesized signals becomes a maximum.
  - 10. The apparatus for selectively extracting according to claim 6, wherein the statistic feature quantity includes a mutual data quantity and the first and second rules are determined so that the mutual data quantity for the first and second synthesized signals becomes a minimum.
  - 11. The apparatus for selectively extracting according to claim 6, wherein:
    - the determining means determines that the first and second rules are associated with weighing the signal components extracted by the extract means;
      
      the first synthesizing means weighs and adds up the signal components extracted by the extract means according to the first rule to form the first synthesized signal; and
      
      the second synthesizing means weighs and adds up the signal components extracted by the extract means according to the second rule to form the second synthesized signal.
  - 12. The apparatus for selectively extracting according to claim 6, wherein the selective output means includes evaluation means for evaluating differences between the first synthesized signal formed by the first synthesizing means and the second synthesized signal formed by the second synthesizing means and the Gaussian distribution, and wherein the one of the first synthesized signal and the second synthesized signal evaluated by the evaluation means to possess the greatest difference from the Gaussian distribution is selectively output as the synthesized voice signal expressing the feature of the voice component.
  - 13. An apparatus for selectively extracting according to claim 6, further comprising a voice recognition means wherein the voice is recognized by using the synthesized voice signal produced by the selective output means.
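The rule determination of claims 9 to 11 and the selection of claim 12 can be sketched for the two-component case. The sketch assumes the components are already uncorrelated with unit variance, restricts the two rules to an orthogonal pair of weight vectors, and scores each pair by the sum of absolute excess kurtosis. That score is a stand-in borrowed from independent component analysis, not the statistic feature quantity or mutual data quantity of the claims, and all names and test signals are hypothetical.

```python
import math
import random

def excess_kurtosis(x):
    """Sample excess kurtosis: close to 0 for Gaussian data."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2) - 3.0

def weigh_and_add(components, weights):
    """Weighted sum of the components, one weight per component."""
    length = len(components[0])
    return [sum(w * c[i] for w, c in zip(weights, components)) for i in range(length)]

def determine_rules(components, steps=45):
    """Grid-search a rotation angle; return (first_rule, second_rule) as weight pairs."""
    best = None
    for k in range(steps):
        theta = (math.pi / 2) * k / steps
        rule1 = (math.cos(theta), math.sin(theta))
        rule2 = (-math.sin(theta), math.cos(theta))
        # Assumed score: total non-Gaussianity of the two synthesized signals.
        score = (abs(excess_kurtosis(weigh_and_add(components, rule1)))
                 + abs(excess_kurtosis(weigh_and_add(components, rule2))))
        if best is None or score > best[0]:
            best = (score, rule1, rule2)
    return best[1], best[2]

def correlation(a, b):
    """Pearson correlation, used only to check the demo result."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)

rng = random.Random(1)
n = 4000
# Unit-variance Laplace samples stand in for voice, Gaussian samples for noise.
voice = [rng.expovariate(1.0) * (1.0 if rng.random() < 0.5 else -1.0) / math.sqrt(2)
         for _ in range(n)]
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

phi = 0.6  # hypothetical mixing angle, unknown to the search
components = [
    [math.cos(phi) * v + math.sin(phi) * z for v, z in zip(voice, noise)],
    [-math.sin(phi) * v + math.cos(phi) * z for v, z in zip(voice, noise)],
]

rule1, rule2 = determine_rules(components)
y1 = weigh_and_add(components, rule1)
y2 = weigh_and_add(components, rule2)
# Selection: output the synthesized signal that is farther from Gaussian.
output = y1 if abs(excess_kurtosis(y1)) >= abs(excess_kurtosis(y2)) else y2
```

A grid search keeps the sketch short and easy to check; a practical system would more likely use an iterative optimizer and handle more than two components.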

14. An article of manufacture comprising:
- a computer readable medium; and
  
  instructions carried on the computer readable medium, the instructions for selectively extracting voice components from a digital voice signal containing a mixture of voice components and noise components, the instructions, when read and executed by a computer, for causing the computer to function as:
  
  a plurality of digital band-pass filters;
  
  extract means for extracting a plurality of kinds of signal components from the digital voice signal containing the mixture of voice components and noise components input from an external unit by using said plurality of filters;
  
  first synthesizing means for forming a first synthesized signal by synthesizing the signal components extracted by said extract means according to a first rule having a first probability density function;
  
  second synthesizing means for forming a second synthesized signal by synthesizing the signal components extracted by said extract means according to a second rule having a second probability density function different from the first rule, respective difference between the first and second probability density functions and a Gaussian distribution being at a maximum; and
  
  selective output means for selectively producing a synthesized voice signal expressing the feature of the voice component based on the first synthesized signal formed by the first synthesizing means and the second synthesized signal formed by the second synthesizing means; and
  
  determining means for determining the first and second rules based on the statistic feature quantity of the first synthesized signal formed by the first synthesizing means and of the second synthesized signal formed by the second synthesizing means, wherein the respective differences between the first synthesized signal and the second synthesized signal and the Gaussian distribution are evaluated, and the synthesized signal evaluated to have the greatest difference from the Gaussian distribution is selectively output as the synthesized voice signal.

Current Assignee: DENSO Corporation
Original Assignee: DENSO Corporation
Inventors: Tamura, Shinichi
Primary Examiner(s): Smits; Talivaldis Ivars
Assistant Examiner(s): Colucci; Michael C
Application Number: US11/073,922
Publication Number: US 20050203744A1
Time in Patent Office: 1,323 Days
Field of Search: 704/208, 704/219, 704/233, 704/255, 704/205, 704/226
US Class Current: 704/233
CPC Class Codes: G10L 21/02 Speech enhancement, e.g. no...
