Speech recognizing apparatus and speech recognizing method

US 20030125943A1
Filed: 12/27/2002
Published: 07/03/2003
Est. Priority Date: 12/28/2001
Status: Active Grant

Abstract

A recognizing target vocabulary comparing unit uses the time series of the amount of characteristics of an input speech to calculate a likelihood for each recognizing target vocabulary, i.e., the likelihood of a registered vocabulary. An environment adaptive noise model comparing unit calculates a likelihood for a noise model adapted to the noise environment, i.e., the likelihood of environmental noise. A rejection determining unit compares the likelihood of the registered vocabulary with the likelihood of the environmental noise and determines whether or not the input speech is noise. When the input speech is determined to be noise, a noise model adapting unit adaptively updates the environment adaptive noise model by using the input speech. The environment adaptive noise model thus comes to match the real environment, so that noise inputs can be rejected with high accuracy.
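
The rejection decision described in the abstract can be illustrated with a minimal sketch. It assumes that each vocabulary entry and each noise model is a single diagonal Gaussian over the feature vectors; the abstract does not say what kind of model is used, so this is a stand-in chosen only to keep the example short. The function names, model names, and numbers are all hypothetical.

```python
import math

def gaussian_log_likelihood(frames, mean, var):
    """Sum of per-frame diagonal-Gaussian log-likelihoods for a feature time series."""
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total

def is_noise(frames, vocabulary_models, noise_models):
    """Reject the input as noise when the best noise model outscores the best vocabulary model."""
    best_vocab = max(gaussian_log_likelihood(frames, m, v)
                     for m, v in vocabulary_models.values())
    best_noise = max(gaussian_log_likelihood(frames, m, v)
                     for m, v in noise_models.values())
    return best_noise > best_vocab

# Hypothetical models: name -> (mean vector, variance vector).
vocabulary_models = {"yes": ([1.0, 1.0], [0.5, 0.5]), "no": ([-1.0, 1.0], [0.5, 0.5])}
noise_models = {"fan": ([0.0, -2.0], [0.5, 0.5])}

speech_like = [[0.9, 1.1], [1.2, 0.8]]
noise_like = [[0.1, -1.9], [-0.2, -2.1]]
print(is_noise(speech_like, vocabulary_models, noise_models))  # False
print(is_noise(noise_like, vocabulary_models, noise_models))   # True
```

Each model is compared with the input one by one, and only the best score on each side enters the decision, which mirrors the comparison of the registered-vocabulary likelihood with the environmental-noise likelihood.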

11 Claims

1. A speech recognizing apparatus comprising:
- a speech section detecting unit to detect a speech section from an input signal;
  
  a characteristic amount extracting unit to analyze an input speech, which is the input signal in said speech section, and to extract a time series of the amount of characteristics representing characteristics of the input speech;
  
  a recognizing target vocabulary storing unit to store predetermined recognizing target vocabularies;
  
  a recognizing target vocabulary comparing unit to compare the time series of the amount of characteristics with respective recognizing target vocabularies read from the recognizing target vocabulary storing unit one by one to obtain a likelihood that respective recognizing target vocabularies coincide with the time series of the amount of characteristics;
  
  an environment adaptive noise model storing unit to store environment adaptive noise models adapted to an environmental noise;
  
  an environment adaptive noise model comparing unit to compare the time series of the amount of characteristics with respective environment adaptive noise models read from the environment adaptive noise model storing unit one by one to obtain a likelihood that respective environment adaptive noise models coincide with the time series of the amount of characteristics;
  
  a rejection determining unit to determine whether or not the input signal is a noise by comparing the likelihood of the registered vocabulary obtained by said recognizing target vocabulary comparing unit with the likelihood of the environmental noise obtained by said environment adaptive noise model comparing unit; and
  
  a noise model adapting unit to update the environment adaptive noise model so as to adapt to the input signal when said rejection determining unit determines that the input signal is the noise.

2. The speech recognizing apparatus according to claim 1, wherein said noise model adapting unit matches the environment adaptive noise model to the amount of characteristics extracted from the input signal.
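
One way to picture the noise model adapting unit of claims 1 and 2 is as an interpolation of the model parameters toward the features of the rejected input. The sketch below is an assumption, not the update rule of the patent: it treats the noise model as a mean vector and uses a hypothetical `rate` parameter to control how far the mean moves.

```python
def adapt_noise_mean(noise_mean, frames, rate=0.5):
    """Move the noise model mean toward the average feature vector of a rejected input."""
    dims = len(noise_mean)
    # Average each feature dimension over the frames of the rejected input.
    frame_mean = [sum(f[d] for f in frames) / len(frames) for d in range(dims)]
    # rate=0 keeps the old model, rate=1 replaces it with the input statistics.
    return [(1.0 - rate) * m + rate * fm for m, fm in zip(noise_mean, frame_mean)]

noise_mean = [0.0, 0.0]
rejected_frames = [[2.0, 4.0], [4.0, 8.0]]
updated = adapt_noise_mean(noise_mean, rejected_frames, rate=0.5)
print(updated)  # [1.5, 3.0]
```

The update runs only for inputs that the rejection determining unit has classified as noise, so speech inputs do not pull the noise model toward the vocabulary.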

3. A speech recognizing apparatus comprising:
- a speech section detecting unit to detect a speech section from an input signal;
  
  a characteristic amount extracting unit to analyze an input speech, which is the input signal in said speech section, and to extract the time series of the amount of characteristics representing the characteristics of the input speech;
  
  a recognizing target vocabulary storing unit to store predetermined recognizing target vocabularies;
  
  a recognizing target vocabulary comparing unit to compare the time series of the amount of characteristics with respective recognizing target vocabularies read from the recognizing target vocabulary storing unit one by one to obtain a likelihood that respective recognizing target vocabularies coincide with the time series of the amount of characteristics;
  
  a recognizing-unit standard pattern storing unit to store recognizing-unit standard patterns;
  
  an environment adaptive recognizing-unit selecting unit to select at least one recognizing-unit standard pattern adaptive to an environmental noise, stored in said recognizing-unit standard pattern storing unit;
  
  an environment adaptive noise model comparing unit to compare the time series of the amount of characteristics with one recognizing-unit standard pattern or with two or more combined recognizing-unit standard patterns, selected by said environment adaptive recognizing-unit selecting unit one by one, to obtain a likelihood that said respective environment adaptive noise models coincide with the time series of the amount of characteristics; and
  
  a rejection determining unit to determine whether or not the input signal is a noise based on the likelihood obtained by said recognizing target vocabulary comparing unit and the likelihood obtained by said environment adaptive noise model comparing unit, wherein said environment adaptive recognizing-unit selecting unit selects again the recognizing-unit standard pattern stored in said recognizing-unit standard pattern storing unit so as to adapt to the input signal when said rejection determining unit determines that the input signal is the noise.

4. The speech recognizing apparatus according to claim 3, wherein the recognizing-unit standard pattern is a phoneme model.

5. The speech recognizing apparatus according to claim 3, wherein said rejection determining unit determines whether or not the input signal is the noise by comparing the likelihood obtained by said recognizing target vocabulary comparing unit with the likelihood obtained by said environment adaptive noise model comparing unit.

6. The speech recognizing apparatus according to claim 3, wherein said rejection determining unit comprises:
- a first determining unit to determine the rejection by using the likelihood obtained by said recognizing target vocabulary comparing unit; and
  
  a second determining unit to determine the rejection by using the likelihood obtained by said environment adaptive noise model comparing unit.

7. The speech recognizing apparatus according to claim 6, wherein said first determining unit determines a rejection by comparing the likelihood obtained by said recognizing target vocabulary comparing unit with a predetermined threshold, and said second determining unit determines a rejection of the input signal which is determined as the speech by said first determining unit by using the likelihood obtained by said environment adaptive noise model comparing unit.
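
The two-stage rejection of claims 6 and 7 can be sketched as follows. The first stage compares the vocabulary likelihood with a predetermined threshold, as claim 7 states. Claim 7 does not say how the second stage uses the noise likelihood, so the sketch assumes the direct comparison of claim 5; the function name and the numeric values are hypothetical.

```python
def two_stage_rejection(vocab_likelihood, noise_likelihood, threshold):
    """Return True when the input should be rejected as noise."""
    # First determining unit: reject when the best vocabulary
    # likelihood falls below the predetermined threshold.
    if vocab_likelihood < threshold:
        return True
    # Second determining unit (assumed rule): an input that passed the
    # first stage is still rejected if the noise model scores higher.
    return noise_likelihood > vocab_likelihood

# Log-likelihood style scores; a larger value means a better match.
print(two_stage_rejection(-120.0, -80.0, threshold=-100.0))  # True, stage one
print(two_stage_rejection(-50.0, -40.0, threshold=-100.0))   # True, stage two
print(two_stage_rejection(-50.0, -90.0, threshold=-100.0))   # False, accepted
```

The second stage only sees inputs that the first stage treated as speech, so the noise model is consulted for inputs that resemble the vocabulary closely enough to pass the threshold.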

8. A speech recognizing method comprising:
- a speech section detecting step of detecting a speech section from an input signal;
  
  a characteristic amount extracting step of analyzing an input speech, which is the input signal in said speech section, and extracting the time series of the amount of characteristics representing characteristics of the input speech;
  
  a recognizing target vocabulary comparing step of comparing the time series of the amount of characteristics with respective recognizing target vocabularies read from a recognizing target vocabulary storing unit to store predetermined recognizing target vocabularies one by one to obtain a likelihood that respective recognizing target vocabularies coincide with the time series of the amount of characteristics;
  
  an environment adaptive noise model comparing step of comparing the time series of the amount of characteristics with respective environment adaptive noise models read from an environment adaptive noise model storing unit to store predetermined environment adaptive noise models one by one to obtain a likelihood that respective environment adaptive noise models coincide with the time series of the amount of characteristics;
  
  a rejection determining step of determining whether or not the input signal is a noise by comparing the likelihood obtained by said recognizing target vocabulary comparing step with the likelihood obtained by said environment adaptive noise model comparing step; and
  
  a noise model adapting step of updating the environment adaptive noise model so as to adapt to the input signal when it is determined that the input signal is the noise.
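
The method of claim 8 begins by detecting a speech section in the input signal. The claim does not say how the section is found; the sketch below assumes a simple short-time energy threshold, which is only one common possibility. The function name, frame length, and threshold are hypothetical.

```python
def detect_speech_section(samples, frame_len, energy_threshold):
    """Return (start, end) sample indices from the first to the last
    frame whose mean energy exceeds the threshold, or None."""
    active = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        if energy > energy_threshold:
            active.append(start)
    if not active:
        return None
    return active[0], active[-1] + frame_len

# Silence, a short burst, then silence again.
signal = [0.0] * 4 + [1.0, -1.0, 1.0, -1.0] + [0.0] * 4
print(detect_speech_section(signal, frame_len=2, energy_threshold=0.5))  # (4, 8)
```

A detector of this kind also passes loud non-speech sounds as speech sections, which is the case the later rejection determining step is meant to catch.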

9. A speech recognizing method comprising:
- a speech section detecting step of detecting a speech section from an input signal;
  
  a characteristic amount extracting step of analyzing an input speech, which is the input signal in said speech section, and extracting the time series of the amount of characteristics of the input speech;
  
  a recognizing target vocabulary comparing step of comparing the time series of the amount of characteristics with respective recognizing target vocabularies read from a recognizing target vocabulary storing unit to store predetermined recognizing target vocabularies one by one to obtain a likelihood that respective recognizing target vocabularies coincide with the time series of the amount of characteristics;
  
  an environment adaptive recognizing-unit selecting step of selecting at least one recognizing-unit standard pattern adaptive to an environmental noise from a recognizing-unit standard pattern storing unit to store recognizing-unit standard patterns;
  
  an environment adaptive noise model comparing step of comparing the time series of the amount of characteristics with one recognizing-unit standard pattern or with two or more combined recognizing-unit standard patterns, selected by said environment adaptive recognizing-unit selecting step one by one, to obtain a likelihood that said respective environment adaptive noise models coincide with the time series of the amount of characteristics;
  
  a rejection determining step of determining whether or not the input signal is a noise based on the likelihood obtained by the environment adaptive recognizing-unit selecting step and the likelihood obtained by the environment adaptive noise model comparing step; and
  
  a step of selecting again the recognizing-unit standard pattern stored in said recognizing-unit standard pattern storing unit so as to adapt to the input signal when it is determined that the input signal is the noise.
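
The re-selection step of claim 9 can be pictured as ranking the stored recognizing-unit standard patterns by how well they match an input that was rejected as noise. The sketch below is an assumption made for illustration: it models each pattern as a one-dimensional Gaussian and keeps the best-scoring ones, without showing how selected patterns would be combined. The names `select_noise_units` and `unit_models`, and all values, are hypothetical.

```python
import math

def log_likelihood(frames, mean, var):
    """Diagonal-Gaussian log-likelihood summed over all frames."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
               for frame in frames
               for x, m, v in zip(frame, mean, var))

def select_noise_units(frames, unit_models, count=1):
    """Pick the `count` stored patterns that score highest on the rejected input."""
    ranked = sorted(unit_models,
                    key=lambda name: log_likelihood(frames, *unit_models[name]),
                    reverse=True)
    return ranked[:count]

# Hypothetical phoneme-like patterns: name -> (mean vector, variance vector).
unit_models = {"s": ([3.0], [1.0]), "a": ([0.0], [1.0]), "m": ([-3.0], [1.0])}
noise_frames = [[2.8], [3.1], [2.9]]
print(select_noise_units(noise_frames, unit_models))           # ['s']
print(select_noise_units(noise_frames, unit_models, count=2))  # ['s', 'a']
```

Because the patterns are only re-selected and not re-estimated, adapting to a new noise environment in this scheme amounts to changing which stored patterns stand in for noise.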

10. A speech recognizing program product for allowing a computer to execute:
- speech section detecting processing for detecting a speech section from an input signal;
  
  characteristic amount extracting processing for analyzing an input speech, which is the input signal in said speech section, and extracting the time series of the amount of characteristics representing characteristics of the input speech;
  
  a recognizing target vocabulary comparing processing of comparing the time series of the amount of characteristics with respective recognizing target vocabularies read from a recognizing target vocabulary storing unit to store predetermined recognizing target vocabularies one by one to obtain a likelihood that respective recognizing target vocabularies coincide with the time series of the amount of characteristics;
  
  an environment adaptive noise model comparing processing of comparing the time series of the amount of characteristics with respective environment adaptive noise models read from an environment adaptive noise model storing unit to store predetermined environment adaptive noise models one by one to obtain a likelihood that respective environment adaptive noise models coincide with the time series of the amount of characteristics;
  
  rejection determining processing for determining whether or not the input signal is a noise by comparing the likelihood obtained by said recognizing target vocabulary comparing processing with the likelihood obtained by said environment adaptive noise model comparing processing; and
  
  noise model adapting processing for updating the environment adaptive noise model so as to adapt to the input signal when it is determined that the input signal is the noise.

11. A speech recognizing program product for allowing a computer to execute:
- speech section detecting processing for detecting a speech section from an input signal;
  
  characteristic amount extracting processing for analyzing an input speech, which is the input signal in said speech section, and extracting the time series of the amount of characteristics representing the characteristics of the input speech;
  
  a recognizing target vocabulary comparing processing of comparing the time series of the amount of characteristics with respective recognizing target vocabularies read from a recognizing target vocabulary storing unit to store predetermined recognizing target vocabularies one by one to obtain a likelihood that respective recognizing target vocabularies coincide with the time series of the amount of characteristics;
  
  an environment adaptive recognizing-unit selecting processing of selecting at least one recognizing-unit standard pattern adaptive to an environmental noise from a recognizing-unit standard pattern storing unit to store recognizing-unit standard patterns;
  
  an environment adaptive noise model comparing processing for comparing the time series of the amount of characteristics with one recognizing-unit standard pattern or with two or more combined recognizing-unit standard patterns, selected by said environment adaptive recognizing-unit selecting processing one by one, to obtain a likelihood that said respective environment adaptive noise models coincide with the time series of the amount of characteristics;
  
  rejection determining processing for determining whether or not the input signal is a noise based on the likelihood obtained by recognizing target vocabulary comparing processing and the likelihood obtained by environment adaptive noise model comparing processing; and
  
  processing for selecting again the recognizing-unit standard pattern stored in said recognizing-unit standard pattern storing unit so as to adapt to the input signal when it is determined that the input signal is the noise.

Current Assignee: Kabushiki Kaisha Toshiba (Toshiba Corporation)
Original Assignee: Kabushiki Kaisha Toshiba (Toshiba Corporation)
Inventors: Koshiba, Ryosuke
Granted Patent: US 7,260,527 B2
US Class Current: 704/238
CPC Class Codes: G10L 15/20 Speech recognition techniqu...
