Speech recognition with preliminary matching

US 4,618,983 A
Filed: 12/22/1982
Issued: 10/21/1986
Est. Priority Date: 12/25/1981
Status: Expired due to Fees

Abstract

The speech recognizer performs final matching on each characteristic pattern in an order selected by a preliminary matching step. Once a specific condition is satisfied during the matching calculation for a pattern, no further calculation is performed for that pattern. As a result, the computation required is greatly reduced, so speech recognition can be implemented at higher speed and at sharply reduced cost.

8 Claims

1. A method of recognizing audible speech made up of a plurality of speech patterns comprising:
- (a) providing a plurality of reference speech patterns, each said reference speech pattern being divided into a predetermined number of feature frames;
  
  (b) introducing a speech signal including at least one speech pattern to be recognized;
  
  (c) dividing said speech pattern to be recognized into said predetermined number of feature frames, said frames of said speech pattern to be recognized corresponding in time to said frames of each said reference pattern;
  
  (d) comparing said speech pattern to be recognized to said reference pattern, in only a portion of said predetermined number of feature frames to determine a coarse indication of relative correlation therebetween;
  
  (e) ranking said reference patterns as to relative degree of correlation with speech patterns to be recognized as represented by said coarse indication developed by said step of comparing, the highest ranking indicating the highest coarse indication of related correlation;
  
  (f) determining a final degree of correlation between only the highest ranked of said reference patterns and said speech pattern to be recognized to determine which reference pattern corresponds to said speech pattern to be recognized.
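
The two-stage flow of claim 1 can be sketched in a few lines. This is only an illustration under stated assumptions: the city-block frame distance, the function names (`frame_distance`, `coarse_score`, `full_score`, `recognize`), the `keep` parameter, and the toy data are all hypothetical and do not come from the patent, which does not fix a particular distance measure in its claims.

```python
# Illustrative sketch of claim 1; every name and the distance measure are assumptions.

def frame_distance(a, b):
    """City-block distance between two feature frames (assumed measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def coarse_score(pattern, reference, frame_subset):
    """Step (d): compare only a portion of the frames."""
    return sum(frame_distance(pattern[i], reference[i]) for i in frame_subset)

def full_score(pattern, reference):
    """Step (f): compare every frame."""
    return sum(frame_distance(p, r) for p, r in zip(pattern, reference))

def recognize(pattern, references, frame_subset, keep=2):
    """references maps a label to a list of frames with the same length as pattern."""
    # Step (e): rank references by coarse score (smaller distance ranks higher).
    ranked = sorted(
        references,
        key=lambda name: coarse_score(pattern, references[name], frame_subset),
    )
    # Step (f): final matching restricted to the highest-ranked candidates.
    shortlist = ranked[:keep]
    return min(shortlist, key=lambda name: full_score(pattern, references[name]))

# Toy data: four frames of two features each.
references = {
    "one": [[1, 1], [2, 2], [3, 3], [4, 4]],
    "two": [[1, 1], [9, 9], [3, 3], [0, 0]],
    "three": [[8, 8], [8, 8], [8, 8], [8, 8]],
}
spoken = [[1, 1], [2, 3], [3, 3], [4, 5]]
result = recognize(spoken, references, frame_subset=[0, 2])
```

In this toy run the coarse pass looks at frames 0 and 2 only, which is enough to drop "three" from the shortlist before any full comparison is made.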

2. A method of recognizing audible speech made up of a plurality of speech patterns comprising:
- (a) providing a plurality of reference speech patterns, each said reference speech pattern being divided into a predetermined number of features;
  
  (b) introducing a speech signal including at least one speech pattern to be recognized;
  
  (c) dividing said speech pattern to be recognized into a number of feature frames;
  
  (d) comparing only a portion of said speech patterns to be recognized to said reference portion patterns to determine a coarse indication of relative correlation therebetween;
  
  (e) ranking said reference patterns as to relative degree of correlation with speech patterns to be recognized as represented by said coarse indication developed by said step (d), the highest ranking indicating the highest coarse indication of relative correlation;
  
  (f) storing a plurality of said reference patterns having the highest rankings as indicated by said coarse indications of relative correlation in a memory in order of their ranking;
  
  (g) storing a threshold similarity value;
  
  (h) correlating each said ranked reference pattern to said speech pattern to be recognized and developing a calculated similarity value indicative of the correlation therebetween;
  
  (i) comparing said stored threshold similarity value with said calculated similarity value;
  
  (j) replacing said stored similarity value with said calculated similarity value to develop a new stored similarity value if said calculated similarity value indicates greater correlation than said stored similarity value;
  
  (k) discarding each said ranked reference pattern as a possible recognized pattern when its calculated similarity value indicates a lower degree of correlation than said stored similarity value;
  
  (l) repeating said steps h-k for each said ranked reference pattern stored in said memory to determine which reference pattern corresponds to said speech pattern to be recognized.
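
Steps (g) through (l) of claim 2 amount to a running "best so far" comparison over the ranked candidates. A minimal sketch follows, assuming a similarity where larger means a closer match (here the negated city-block distance); the names `similarity` and `best_match`, the initial threshold, and the toy data are hypothetical and not taken from the patent.

```python
# Illustrative sketch of claim 2, steps (g)-(l); all names are assumptions.

def similarity(pattern, reference):
    """Assumed measure: negated city-block distance, so larger means closer."""
    return -sum(
        abs(x - y)
        for p, r in zip(pattern, reference)
        for x, y in zip(p, r)
    )

def best_match(pattern, ranked_references, initial_threshold=float("-inf")):
    stored = initial_threshold                        # step (g): stored threshold
    winner = None
    for name, reference in ranked_references:         # step (l): repeat per candidate
        calculated = similarity(pattern, reference)   # step (h)
        if calculated > stored:                       # step (i): compare with stored value
            stored = calculated                       # step (j): replace stored value
            winner = name
        # Step (k): otherwise the candidate is discarded.
        # Ties keep the earlier candidate; the claim does not address equal values.
    return winner, stored

spoken = [[1, 1], [2, 3]]
ranked = [
    ("two", [[1, 1], [9, 9]]),
    ("one", [[1, 1], [2, 2]]),
]
winner, stored = best_match(spoken, ranked)
```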

3. A method of recognizing audible speech made up of a plurality of speech patterns comprising:
- (a) providing a plurality of reference speech patterns, each said reference speech pattern being divided into a predetermined number of feature frames;
  
  (b) introducing a speech signal including at least one speech pattern to be recognized;
  
  (c) dividing said speech pattern to be recognized into said predetermined number of feature frames, said frames of said speech pattern to be recognized corresponding in time to said frames of each said reference pattern;
  
  (d) comparing said speech pattern to be recognized to said reference patterns in only a portion of said predetermined number of feature frames to determine a coarse indication of relative correlation therebetween;
  
  (e) ranking said reference patterns as to relative degree of correlation with the speech pattern to be recognized, as represented by said coarse indication developed by said step of comparing, the highest ranking indicating the highest coarse indication of relative correlation;
  
  (f) storing a plurality of said reference patterns having the highest rankings as indicated by said coarse indications of relative correlation in a memory in order of their ranking;
  
  (g) storing a threshold similarity value;
  
  (h) correlating a said ranked reference pattern to said speech pattern to be recognized and developing a calculated similarity value indicative of the correlation therebetween;
  
  (i) comparing said stored threshold similarity value with said calculated similarity value;
  
  (j) replacing said stored similarity value with said calculated similarity value to develop a new stored similarity value if said calculated similarity value indicates greater correlation than said stored similarity value;
  
  (k) discarding each said ranked reference pattern as a possible recognized pattern when its calculated similarity value indicates a lower degree of correlation than said stored similarity value;
  
  (l) repeating said steps h-k for each said ranked reference pattern stored in said memory to determine which said ranked reference pattern corresponds to the introduced speech pattern to be recognized.
  - 4. The method of claim 3 further comprising executing final recognition of said introduced speech pattern to be recognized by outputting the reference pattern associated with threshold similarity value that is stored after all of said ranked reference patterns stored in memory are compared thereto.
  - 5. The method of claim 3 wherein said step h comprises the steps of:
    - (h1) determining a frame similarity value indicative of the correlation between a frame of the said ranked reference pattern and a corresponding frame of said speech pattern to be recognized;
      
      (h2) accumulating said frame similarity values to develop said calculated similarity value;
      
      (h3) repeating steps h1 and h2 until said calculated similarity value either indicates that said similarity is lower than said stored threshold similarity value or until all said frames have a frame similarity value calculated therefor.
  - 6. The method of claim 3 wherein said ranked reference pattern is discarded by step k as soon as sufficient frames have been correlated to demonstrate that said ranked reference pattern will have a lower calculated similarity than the degree of similarity represented by said stored similarity value.
  - 7. The method of claim 3 wherein said similarity values are directly related to the degree of similarity.
  - 8. The method of claim 3 wherein said similarity values are directly related to the degree of matching error.
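
The frame-by-frame accumulation of claim 5 and the early discard of claim 6 can be sketched as follows, using the matching-error reading of claim 8 (a larger value means a worse match). This is a hedged illustration: the city-block frame error, the function names `accumulate_with_cutoff` and `recognize_with_cutoff`, and the toy data are assumptions, not the patent's own implementation. The early exit is valid here only because each frame error is non-negative, so the running total can never decrease.

```python
# Illustrative sketch of claims 5, 6 and 8; all names and the error measure are assumptions.

def accumulate_with_cutoff(pattern, reference, stored_error):
    """Return (total_error or None if abandoned, number of frames evaluated)."""
    total = 0
    for frames_used, (p, r) in enumerate(zip(pattern, reference), start=1):
        frame_error = sum(abs(x - y) for x, y in zip(p, r))  # step (h1)
        total += frame_error                                 # step (h2)
        if total > stored_error:                             # step (h3) / claim 6
            return None, frames_used                         # cannot beat the stored value
    return total, frames_used

def recognize_with_cutoff(pattern, ranked_references):
    stored_error = float("inf")   # stored threshold, expressed as an error
    winner = None
    frames_evaluated = 0
    for name, reference in ranked_references:
        total, used = accumulate_with_cutoff(pattern, reference, stored_error)
        frames_evaluated += used
        if total is not None and total < stored_error:
            stored_error, winner = total, name
    return winner, stored_error, frames_evaluated

spoken = [[1, 1], [2, 3], [3, 3], [4, 5]]
ranked = [
    ("one", [[1, 1], [2, 2], [3, 3], [4, 4]]),
    ("three", [[8, 8], [8, 8], [8, 8], [8, 8]]),
    ("two", [[1, 1], [9, 9], [3, 3], [0, 0]]),
]
winner, stored_error, frames_evaluated = recognize_with_cutoff(spoken, ranked)
```

In this toy run only 7 frame comparisons are made instead of the 12 a full evaluation would need, because the second and third candidates are abandoned after one and two frames respectively. Evaluating the best coarse candidate first is what makes the stored threshold tight early on.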

Current Assignee: Sharp Kabushiki Kaisha (Hon Hai Precision Industry Co., Ltd.)
Original Assignee: Sharp Kabushiki Kaisha (Hon Hai Precision Industry Co., Ltd.)
Inventors: Hakaridani, Mitsuhiro; Nishioka, Yoshiki; Iwahashi, Hiroyuki
Primary Examiner(s): KEMENY, EMANUEL
Application Number: US06/452,298
Time in Patent Office: 1,399 Days
Field of Search: 381/42-43, 364/513, 364/513.5
US Class Current: 704/239
CPC Class Codes: G10L 15/12 using dynamic programming t...
