Speech recognition system allows new vocabulary words to be added without requiring spoken samples of the words

US 5,623,578 A
Filed: 10/28/1993
Issued: 04/22/1997
Est. Priority Date: 10/28/1993
Status: Expired due to Fees

Abstract

A speech recognition method implemented in a computer system recognizes words without requiring prior creation of models for such words based on spoken entries. A key word is entered in nonspoken form and a string of phonemes is defined by the speech recognizer to represent the new key word. A response signal is generated from each phoneme in the new key word model. Such response signals are utilized to define a multidimensional validity field for the new key word. Upon receipt of a spoken word from a user, a string of phonemes is assigned to represent the spoken word. A response signal from each phoneme in the model used to represent the spoken word is contrasted with the validity fields previously defined for the corresponding key word. A determination is made as to whether the spoken word is valid or not based on whether the response signals representing the spoken word lie within the validity fields.
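As a rough illustration of the validity check described in the abstract, the sketch below treats each response signal as a (score, duration) pair and decides validity by which region's centroid is nearer. The two-dimensional representation, the nearest-centroid rule, the function names, and the sample values are all assumptions made for illustration; the text here does not specify how the regions of the field are bounded.

```python
from math import dist
from statistics import fmean
from typing import Dict, List, Tuple

# Assumption: a response signal is a (score, duration) point.
Response = Tuple[float, float]


def centroid(points: List[Response]) -> Response:
    """Mean point of a set of response signals."""
    return (fmean(p[0] for p in points), fmean(p[1] for p in points))


def build_validity_field(valid: List[Response],
                         invalid: List[Response]) -> Dict[str, Response]:
    """Summarize each region of the field by its centroid (illustrative rule)."""
    return {"valid": centroid(valid), "invalid": centroid(invalid)}


def is_valid(field: Dict[str, Response], response: Response) -> bool:
    """Accept the spoken word if its response lies nearer the valid region."""
    return dist(response, field["valid"]) < dist(response, field["invalid"])


# Made-up sample data: responses from the key word's own phoneme string
# versus responses from a different phoneme string.
valid_responses = [(0.90, 0.30), (0.80, 0.35), (0.85, 0.32)]
invalid_responses = [(0.20, 0.10), (0.30, 0.60), (0.25, 0.50)]
field = build_validity_field(valid_responses, invalid_responses)
```

With this field, a response such as `(0.80, 0.30)` falls on the valid side and `(0.30, 0.45)` on the invalid side. A production recognizer would likely use more dimensions and a statistically motivated boundary.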

18 Claims

1. In a computer system, a speech recognition method comprising the steps of:
- a) receiving a user spoken word (USW);
  
  b) generating score parameters for each of a plurality of first phoneme strings by comparing output values of each against the USW;
  
  c) selecting one of the first phoneme strings having a best correlation to the USW based on said score parameters, said one phoneme string corresponding to a first word in a stored database;
  
  d) generating a decision field having a first region that contains a first set of response signals and a second region that contains a second set of response signals, said first set of response signals including response signals obtained by exciting said one phoneme string, said second set of response signals obtained by exciting a second string of phonemes that differs from said one phoneme string;
  
  e) generating a third response signal based on exciting said one phoneme string with the USW;
  
  f) determining whether said USW is a valid input of the first word based on a comparison of said third response signal to said decision field, said USW comprising a valid input of the first word if said third response signal is within said first region and an invalid input of the first word if said third response signal is within said second region.
  - 2. The method according to claim 1 wherein said second string of phonemes comprises randomly selected phonemes from a table of phonemes stored in the database.
  - 3. The method according to claim 2 wherein said one phoneme string consists of X phonemes, said second string of phonemes consisting of X randomly selected phonemes from a table of phonemes stored in the database.
  - 4. The method according to claim 1 wherein said first and second sets of response signals have a score and time duration component.
  - 5. The method according to claim 1 wherein said first and second regions of the validity field represent multidimensional parameters.
  - 6. The method according to claim 1 wherein the step of generating the decision field occurs prior to receiving the USW.
  - 7. The method according to claim 6 wherein a plurality of said decision fields are stored in the database.
  - 8. The method according to claim 1 further comprising the steps of:
    - receiving a new key word (NKW) in non-spoken form to be recognized by the system;
      
      identifying a string of phonemes to represent said NKW;
      
      storing said NKW and its associated string of phonemes in the database.
  - 9. The method according to claim 8 wherein said NKW is received as alphanumeric characters.
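
Claims 2, 3, 8 and 9 can be pictured with a short sketch: a key word arrives as typed characters, is mapped to a phoneme string and stored, and a contrasting string of the same length is drawn at random from a phoneme table. The phoneme table, the letter-to-phoneme map, and the function names below are hypothetical stand-ins; the claims do not say how phonemes are derived from the characters, and a real system would more plausibly use a pronunciation dictionary or letter-to-sound rules.

```python
import random
from typing import Dict, List, Optional

# Hypothetical phoneme table and one-letter-per-phoneme map (illustration only).
PHONEME_TABLE = ["AE", "B", "K", "D", "EH", "S", "T", "IY", "M", "N"]
LETTER_TO_PHONEME = {
    "a": "AE", "b": "B", "c": "K", "d": "D", "e": "EH",
    "s": "S", "t": "T", "i": "IY", "m": "M", "n": "N",
}


def enroll_key_word(database: Dict[str, List[str]], word: str) -> List[str]:
    """Map a typed key word to a phoneme string and store both in the database."""
    key = word.lower()
    unknown = [ch for ch in key if ch not in LETTER_TO_PHONEME]
    if not key or unknown:
        raise ValueError(f"cannot map {word!r} to phonemes")
    phonemes = [LETTER_TO_PHONEME[ch] for ch in key]
    database[key] = phonemes
    return phonemes


def random_contrast_string(model: List[str],
                           rng: Optional[random.Random] = None) -> List[str]:
    """Draw len(model) random phonemes, retrying until the string differs."""
    if rng is None:
        rng = random.Random()
    while True:
        candidate = [rng.choice(PHONEME_TABLE) for _ in model]
        if candidate != model:
            return candidate


database: Dict[str, List[str]] = {}
model = enroll_key_word(database, "Cab")          # ["K", "AE", "B"]
contrast = random_contrast_string(model, random.Random(0))
```

The contrast string has the same number of phonemes as the key word model, matching the "X phonemes" wording of claim 3, and the retry loop keeps it different from the model as claim 1 requires.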

10. A speech recognition system comprising:
- a) means for receiving a user spoken word (USW);
  
  b) means for generating score parameters for each of a plurality of first phoneme strings by comparing output values of each against the USW;
  
  c) means for selecting one of the first phoneme strings having a best correlation to the USW based on said score parameters, said one phoneme string corresponding to a first word in a stored database;
  
  d) means for generating a decision field having a first region that contains a first set of response signals and a second region that contains a second set of response signals, said first set of response signals including response signals obtained by exciting said one phoneme string, said second set of response signals obtained by exciting a second string of phonemes that differs from said one phoneme string;
  
  e) means for generating a third response signal based on exciting said one phoneme string with the USW;
  
  f) means for determining whether said USW is a valid input of the first word based on a comparison of said third response signal to said decision field, said USW comprising a valid input of the first word if said third response signal is within said first region and an invalid input of the first word if said third response signal is within said second region.
  - 11. The system according to claim 10 wherein said second string of phonemes comprises randomly selected phonemes from a table of phonemes stored in the database.
  - 12. The system according to claim 11 wherein said one phoneme string consists of X phonemes, said second string of phonemes consisting of X randomly selected phonemes from a table of phonemes stored in the database.
  - 13. The system according to claim 10 wherein said first and second sets of response signals have a score and time duration component.
  - 14. The system according to claim 10 wherein said first and second regions of the validity field represent multidimensional parameters.
  - 15. The system according to claim 10 wherein the means of generating the decision field generates the decision field prior to receiving the USW.
  - 16. The system according to claim 15 further comprising means for storing a plurality of said decision fields in the database.
  - 17. The system according to claim 10 further comprising:
    - means for receiving a new key word (NKW) in non-spoken form to be recognized by the system;
      
      means for identifying a string of phonemes to represent said NKW;
      
      means for storing said NKW and its associated string of phonemes in the database.
  - 18. The system according to claim 17 wherein said means for receiving the NKW receives the NKW as alphanumeric characters.

Current Assignee: Lucent Technologies, Inc. (Nokia Corporation)
Original Assignee: Lucent Technologies, Inc. (Nokia Corporation)
Inventors: Mikkilineni, Rajendra P.
Primary Examiner(s): MacDonald, Allen R.
Assistant Examiner(s): Dorvil, Richemond
Application Number: US08/144,961
Time in Patent Office: 1,272 Days
Field of Search: 395/2.64, 395/2.65, 395/2.6, 395/2.41, 395/2.49, 381/41, 381/42, 381/43
US Class Current: 704/255
CPC Class Codes:
G10L 15/063 Training
G10L 2015/0635 updating or merging of old ...
G10L 2015/0638 Interactive procedures
