Speech recognition apparatus using neural network and fuzzy logic

US 5,040,215 A
Filed: 08/30/1989
Issued: 08/13/1991
Est. Priority Date: 09/07/1988
Status: Expired due to Fees

Abstract

A speech recognition apparatus has a speech input unit for inputting speech; a speech analysis unit for analyzing the input speech and outputting a time series of feature vectors; a candidate selection unit for receiving the time series of feature vectors from the speech analysis unit and selecting a plurality of candidate recognition results from the speech categories; and a discrimination processing unit for discriminating among the selected candidates to obtain a final recognition result. The discrimination processing unit includes three components: a pair generation unit for generating all two-candidate combinations of the n candidates selected by the candidate selection unit; a pair discrimination unit for deciding, for each of the ₙC₂ combinations (or pairs), which candidate of the pair is more certain, on the basis of extracted acoustic features intrinsic to each candidate speech; and a final decision unit for collecting the pair discrimination results obtained from the pair discrimination unit for all ₙC₂ pairs to decide the final result. The pair discrimination unit handles the extracted acoustic features intrinsic to each candidate speech as fuzzy information and performs the discrimination on the basis of fuzzy logic algorithms; the final decision unit likewise aggregates the results on the basis of fuzzy logic algorithms.
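
The pair generation step described in the abstract enumerates every unordered pair among the n selected candidates, which gives n(n-1)/2 pairs. A minimal sketch of that enumeration follows; the phoneme labels and the function name `generate_pairs` are illustrative assumptions, not taken from the patent.

```python
from itertools import combinations
from math import comb


def generate_pairs(candidates):
    """Return every unordered pair of distinct candidates (nC2 pairs)."""
    return list(combinations(candidates, 2))


# Hypothetical top-4 candidates from the candidate selection step
candidates = ["p", "t", "k", "b"]
pairs = generate_pairs(candidates)

# The number of pairs matches the binomial coefficient nC2
assert len(pairs) == comb(len(candidates), 2)
```

Each pair would then be passed to the pair discrimination unit, so the cost of the discrimination stage grows quadratically with the number of selected candidates.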

12 Claims

1. A speech recognition apparatus comprising:
- input means for inputting speech;
  
  feature extraction means for extracting feature vectors from the input speech in each of a series of predetermined times and for obtaining a feature vector series;
  
  candidate selection means for selecting high-ranking candidates of recognition result by matching the feature vector series with various categories;
  
  pair generation means for generating a plurality of pairs of candidates from the candidates selected by said candidate selection means;
  
  pair discrimination means for discriminating between each candidate of each pair of selected candidates, wherein said pair discrimination means comprises neural network means for extracting several acoustic cues specific to a respective pair from the feature vector series, said neural network means having respectively suitable structures for extracting the several acoustic cues by setting up connection coefficients based on information stored in a first memory, and logic means for selecting the most certain one of the several acoustic cues based on extracted results of said neural network means; and
  
  decision means for ranking the selected candidates based on a pair discrimination result of said pair discrimination means, thereby representing which candidate of the selected candidates corresponds to the input speech.
- - 2. A speech recognition apparatus according to claim 1, further comprising a first memory for storing connection coefficients of said neural network means to be set up in order to extract the several acoustic cues specified to the respective pair of the selected candidates.
  - 3. A speech recognition apparatus according to claim 1, further comprising membership function calculation means for transforming the extracted results of said neural network means into membership function values, wherein said logic means selects the most certain one of the several acoustic cues based on the transformed membership function value.
  - 4. A speech recognition apparatus according to claim 3, wherein said logic means comprises a fuzzy logic OR device for determining for each pair of selected candidates the maximum of the transformed membership values.
  - 5. A speech recognition apparatus according to claim 3, further comprising membership function calculation means for transforming the extracted results of said neural network means into membership function values, wherein said logic means selects the most certain one of the several acoustic cues based on the transformed membership function value.
  - 6. A speech recognition apparatus according to claim 5, wherein said logic means comprises a neural network for determining for each pair of the selected candidates the maximum of the transformed membership function values.
  - 7. A speech recognition apparatus according to claim 3, wherein said logic means comprises a fuzzy logic OR device for determining for each pair of the selected candidates the maximum of the transformed membership values.
  - 8. A speech recognition apparatus according to claim 1, wherein said logic means comprises a neural network for determining for each pair of the selected candidates the maximum of the transformed membership values.
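
Claims 3 and 4 describe transforming the cue extraction outputs into membership function values and then taking their maximum with a fuzzy OR. The claims do not specify the shape of the membership function, so the sketch below assumes a clamped linear ramp; the names `membership` and `fuzzy_or`, the `low`/`high` bounds, and the sample outputs are all hypothetical.

```python
def membership(raw_output, low=0.0, high=1.0):
    """Map a raw cue-extractor output onto [0, 1].

    Assumption: a clamped linear ramp between `low` and `high`;
    the claims do not state the actual function shape.
    """
    if high <= low:
        raise ValueError("high must exceed low")
    value = (raw_output - low) / (high - low)
    return min(1.0, max(0.0, value))


def fuzzy_or(values):
    """Fuzzy OR taken as the maximum of the membership values."""
    return max(values)


# Hypothetical outputs of three cue extractors for one candidate pair
cue_outputs = [0.25, 1.3, -0.1]
memberships = [membership(x) for x in cue_outputs]
most_certain = fuzzy_or(memberships)
```

Taking the maximum means a single strongly expressed cue is enough to decide a pair, which matches the claim language about selecting "the most certain one" of the cues.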

9. A speech recognition apparatus comprising:
- input unit inputting speech and converting the input speech into a digital signal;
  
  a spectral analysis unit for extracting feature vectors from the digital signal of the input speech in each of a series of predetermined times and for obtaining a feature vector series;
  
  a candidate selection unit for selecting high-ranking candidates of various phonemes by matching the feature vector series with the various phonemes;
  
  a pair generator for generating a plurality of pairs of candidates from the candidates selected by said candidate selector;
  
  a pair discrimination unit for discriminating between each candidate of each pair of selected candidates, wherein said pair discrimination unit comprises neural networks for extracting several acoustic cues specific to the respective pair from the feature vector series, said neural networks having respectively suitable structures for extracting the several acoustic cues by setting up connection coefficients based on information stored in a first memory, and a fuzzy logic unit for selecting the most certain one of the several acoustic cues based on extracted results of said neural networks; and
  
  a decision unit ranking the selected candidates based on pair discrimination results of said pair discrimination unit, thereby representing which candidate of the selected candidates corresponds to the input speech.
- - 10. A speech recognition apparatus according to claim 9, further comprising a membership function calculation unit for transforming the extracted results of said neural networks into membership function values.
  - 11. A speech recognition apparatus according to claim 10, wherein said fuzzy logic unit comprises a fuzzy OR device for determining for each pair of the selected candidates the maximum of the transformed membership function values.
  - 12. A speech recognition apparatus according to claim 11, wherein said decision unit comprises a fuzzy logic AND device for summing the transformed membership function values for each of the selected candidates and for ranking the selected candidates based on its summation result.
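
Claim 12 has the decision unit sum the membership function values for each candidate and rank the candidates by that sum. The sketch below illustrates one way this could work. It assumes each pair result is a single value in [0, 1] favoring the first candidate, with the second candidate credited the complement; that convention, the function name `rank_candidates`, and the sample values are assumptions for illustration only.

```python
from collections import defaultdict


def rank_candidates(pair_results):
    """Rank candidates by the sum of their pairwise membership values.

    pair_results maps (a, b) -> membership value in [0, 1] that a is the
    more certain candidate. Assumption: b is credited with 1 - value.
    """
    totals = defaultdict(float)
    for (a, b), mu in pair_results.items():
        totals[a] += mu
        totals[b] += 1.0 - mu
    return sorted(totals, key=lambda c: totals[c], reverse=True)


# Hypothetical pair discrimination results for three candidates
pair_results = {("p", "t"): 0.75, ("p", "k"): 0.625, ("t", "k"): 0.25}
ranking = rank_candidates(pair_results)
```

With these sample values the totals are 1.375 for "p", 1.125 for "k", and 0.5 for "t", so the top-ranked candidate is "p".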

Current Assignee: Hitachi, Ltd.
Original Assignee: Hitachi, Ltd.
Inventors: Amano, Akio; Hataoka, Nobuo; Ichikawa, Akira
Primary Examiner(s): Harkcom, Gary V.
Application Number: US07/400,342
Time in Patent Office: 713 Days
Field of Search: 381/41-46, 364/513, 364/513.5, 382/14-17, 382/30, 382/37-38, 382/39
US Class Current: 704/232
CPC Class Codes: G10L 15/16 using artificial neural net...; Y10S 706/90 Fuzzy logic
