Methods and apparatus for improving the reliability of recognizing words in a large database when the words are spelled or spoken

US 5,748,840 A
Filed: 05/09/1995
Issued: 05/05/1998
Est. Priority Date: 12/03/1990
Status: Expired due to Fees

Abstract

A method and apparatus for identifying any one of a plurality of words, each word having an audible form represented by a sequence of spoken speech elements, with each speech element having a respective position in the sequence, which involves: receiving spoken speech elements of a word and interpreting each received speech element, wherein each spoken speech element α may be interpreted as any one of a plurality of different speech elements β, one of the speech elements β being the same as speech element α; assigning to each of the possible speech elements a respective plurality of probabilities, P_{αβ}, that the speech element will be interpreted as a speech element β when a speech element α has been spoken; storing data representing each word, the data for each word including identification of each speech element in the word and identification of the respective position of each speech element in the sequence of speech elements representing the word; receiving a sequence of speech elements spoken by a person and representing one of the stored words, and interpreting each speech element of the spoken word and the position of each speech element in the sequence of spoken speech elements; and comparing the interpreted speech elements with stored data representing each word of the plurality of words and performing a computation, using the probability, P_{αβ}, associated with each interpreted speech element β to identify the word whose speech elements correspond most closely to interpreted speech elements.
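The comparison step described in the abstract can be illustrated with a minimal Python sketch. It assumes the computation is a per-position sum of the probabilities (the variant recited in dependent claim 2). The table `P`, its values, the vocabulary, and the function names `score` and `identify` are all hypothetical, and the handling of words whose length differs from the input is an assumption, since the text does not address it.

```python
from typing import Dict, List, Tuple

# Hypothetical table: P[(alpha, beta)] is the weight that a spoken element
# alpha is interpreted as beta. All values are invented for illustration.
P: Dict[Tuple[str, str], float] = {
    ("B", "B"): 0.6, ("B", "D"): 0.3, ("B", "P"): 0.1,
    ("D", "D"): 0.7, ("D", "B"): 0.2, ("D", "T"): 0.1,
    ("A", "A"): 0.9, ("A", "K"): 0.1,
    ("O", "O"): 0.9, ("O", "A"): 0.1,
    ("T", "T"): 0.6, ("T", "D"): 0.2, ("T", "P"): 0.2,
    ("G", "G"): 0.7, ("G", "T"): 0.3,
}

# Hypothetical stored words; each string position is an element position.
VOCABULARY: List[str] = ["BAT", "DOG", "DOT", "BAD"]


def score(word: str, interpreted: str, p: Dict[Tuple[str, str], float]) -> float:
    """Sum P[(alpha, beta)] over positions, alpha from the stored word and
    beta from the interpreted sequence. Unknown pairs contribute 0."""
    if len(word) != len(interpreted):
        return 0.0  # assumption: length mismatches are simply scored as 0
    return sum(p.get((a, b), 0.0) for a, b in zip(word, interpreted))


def identify(interpreted: str, vocabulary: List[str],
             p: Dict[Tuple[str, str], float]) -> str:
    """Return the stored word with the largest summed score."""
    return max(vocabulary, key=lambda w: score(w, interpreted, p))


# The recognizer heard "D A T"; the speaker may have spelled "B A T".
best = identify("DAT", VOCABULARY, P)
```

With these invented values, "BAT" scores 0.3 + 0.9 + 0.6, which exceeds the score of "DOT" (0.7 + 0.1 + 0.6), so a misheard first letter does not prevent the intended word from ranking first.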

12 Claims

1. A method for identifying any one of a plurality of words using a programmed digital data processing system, each word having an audible form represented by a sequence of spoken speech elements, with each speech element having a respective position in the sequence, the digital data processing system being connected to means for receiving spoken speech elements of a word and interpreting each received speech element, wherein there is a plurality of possible speech elements, each spoken speech element is a speech element α, each interpreted speech element is a speech element β, and each spoken speech element α may be interpreted as any one of a plurality of different speech elements β, one of the speech elements β being the same as speech element α, said method comprising:
- assigning to each of the possible speech elements a respective plurality of probabilities, P_{αβ}, that the speech element will be interpreted as a speech element β when a speech element α has been spoken;
  
  storing data representing each word of the plurality of words, the data for each word including identification of each speech element in the word and identification of the respective position of each speech element in the sequence of speech elements representing the word;
  
  in the means for receiving and interpreting, receiving a sequence of speech elements spoken by a person and representing one of the stored words, and interpreting each speech element of the spoken word and the position of each speech element in the sequence of spoken speech elements; and
  
  comparing the interpreted speech elements with stored data representing each word of the plurality of words and performing a computation, using the probability, P_{αβ}, associated with each interpreted speech element β to identify the word of the plurality of words whose speech elements correspond most closely to interpreted speech elements.
- Dependent claims (2, 3, 4, 5, 6)
  - 2. A method as defined in claim 1 wherein said step of performing a computation comprises summing the probabilities, P_{αβ}, associated with the interpreted speech elements β of the received sequence of speech elements and with the speech elements α in the same positions as the interpreted speech elements for at least a number of the plurality of words, and determining that word of the number of words which is associated with the largest sum.
  - 3. A method as defined in claim 2 comprising the preliminary step of having each of the possible speech elements spoken a given number of times, N_α, interpreting each spoken speech element in the means for receiving and interpreting, determining the number of times, N_{αβ}, each spoken speech element α is interpreted as a speech element β, and for each combination of a respective spoken speech element α and a respective interpreted speech element β, calculating a probability, P_{αβ}, equal to N_{αβ}, for α=β, divided by the sum of all N_{αβ} for the respective interpreted speech element β and all spoken speech elements α.
  - 4. A method as defined in claim 1 comprising the further step, after said steps of comparing and performing a computation, recalculating the probabilities, P_{αβ}, by increasing, by one unit, each N_{αβ} associated with each interpreted speech element β and the speech element α in the same position as the interpreted speech element in the identified word.
  - 5. A method as defined in claim 1 wherein each speech element is a letter spoken when a word is spelled.
  - 6. A method as defined in claim 1 wherein each speech element is a phoneme pronounced when a word is spoken.
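The counting steps in claims 3 and 4 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: it reads claim 3 as dividing each count N_{αβ} by the total of all counts for the same interpreted element β, and it does not model the "for α=β" qualifier. The function names and the trial data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Counts = Dict[Tuple[str, str], int]


def count_interpretations(trials: Iterable[Tuple[str, str]]) -> Counts:
    """Tally N[(alpha, beta)]: how often spoken alpha was interpreted as beta."""
    counts: Counts = defaultdict(int)
    for spoken, interpreted in trials:
        counts[(spoken, interpreted)] += 1
    return counts


def probabilities(counts: Counts) -> Dict[Tuple[str, str], float]:
    """Assumed reading of claim 3: N[(alpha, beta)] divided by the sum of
    N[(alpha', beta)] over every spoken alpha' for the same interpreted beta."""
    totals: Dict[str, int] = defaultdict(int)
    for (_, beta), n in counts.items():
        totals[beta] += n
    return {(a, b): n / totals[b] for (a, b), n in counts.items() if totals[b] > 0}


def reinforce(counts: Counts, identified_word: str, interpreted: str) -> None:
    """Claim 4 style update: add one to each N[(alpha, beta)] pairing the
    identified word's element with the interpreted element at that position."""
    for alpha, beta in zip(identified_word, interpreted):
        counts[(alpha, beta)] = counts.get((alpha, beta), 0) + 1


# Hypothetical training run: "B" and "D" are each spoken four times.
trials = [("B", "B"), ("B", "B"), ("B", "B"), ("B", "D"),
          ("D", "D"), ("D", "D"), ("D", "D"), ("D", "B")]
counts = count_interpretations(trials)
p_before = probabilities(counts)

# After "DAD" was heard and the system identified "BAD", update and recompute.
reinforce(counts, "BAD", "DAD")
p_after = probabilities(counts)
```

Recomputing the whole table after every update keeps the sketch short; a production system would more plausibly update only the affected columns.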

7. A programmed digital data processing system for identifying any one of a plurality of words, each word having an audible form represented by a sequence of spoken speech elements, with each speech element having a respective position in the sequence, wherein there is a plurality of possible speech elements, each spoken speech element is a speech element α, each interpreted speech element is a speech element β, and each spoken speech element α may be interpreted as any one of a plurality of different speech elements β, one of the speech elements β being the same as speech element α, said apparatus comprising:
- first data storage means for storing, for each of the possible speech elements, a respective plurality of probabilities, P_{αβ}, that the speech element will be interpreted as a speech element β when a speech element α has been spoken;
  
  second data storage means for storing data representing each word of the plurality of words, the data for each word including identification of each speech element in the word and identification of the respective position of each speech element in the sequence of speech elements representing the word;
  
  means for receiving a sequence of speech elements spoken by a person and representing one of the stored words, and for interpreting each speech element of the spoken word and the position of each speech element in the sequence of spoken speech elements; and
  
  means connected for comparing the interpreted speech elements with stored data representing each word of the plurality of words and performing a computation, using the probability, P_{αβ}, associated with each interpreted speech element β to identify the word of the plurality of words whose speech elements correspond most closely to interpreted speech elements.
- Dependent claims (8, 9, 10)
  - 8. A system as defined in claim 7 wherein said means for comparing and performing a computation comprise means for summing the probabilities, P_{αβ}, associated with the interpreted speech elements β of the received sequence of speech elements and with the speech elements α in the same positions as the interpreted speech elements for at least a number of the plurality of words, and means for determining that word of the number of words which is associated with the largest sum.
  - 9. A system as defined in claim 8 further comprising means for performing a preliminary step of having each of the possible speech elements spoken a given number of times, N_α, interpreting each spoken speech element in the means for receiving and interpreting, determining the number of times, N_{αβ}, each spoken speech element α is interpreted as a speech element β, and for each combination of a respective spoken speech element α and a respective interpreted speech element β, calculating a probability, P_{αβ}, equal to N_{αβ} for α=β, divided by the sum of all N_{αβ} for the respective interpreted speech element β and all spoken speech elements α.
  - 10. A system as defined in claim 7 further comprising means for recalculating the probabilities, P_{αβ}, by increasing, by one unit, each N_{αβ} associated with each interpreted speech element β and the speech element α in the same position as the interpreted speech element in the identified word.
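Read together, claims 8 to 10 suggest the formulation below. It is one possible reading rather than language from the patent: the symbols $w_i$, $n$, $V$, $S(w)$ and $\hat{w}$ are notation introduced here, and the normalization over all spoken elements for a fixed interpreted element β is an assumed interpretation of claim 9 that leaves the "for α=β" qualifier aside.

```latex
\begin{align*}
% Claim 8: score a stored word w = (w_1, ..., w_n) against the
% interpreted elements (\beta_1, ..., \beta_n), then take the largest sum.
S(w) &= \sum_{i=1}^{n} P_{w_i \beta_i},
&
\hat{w} &= \arg\max_{w \in V} S(w) \\[4pt]
% Claim 9 (assumed reading): counts normalized per interpreted element.
P_{\alpha\beta} &= \frac{N_{\alpha\beta}}{\sum_{\alpha'} N_{\alpha'\beta}} \\[4pt]
% Claim 10: reinforce the counts along the identified word.
N_{\hat{w}_i \beta_i} &\leftarrow N_{\hat{w}_i \beta_i} + 1,
&
i &= 1, \dots, n
\end{align*}
```

As a worked example under that reading: if a spoken B was interpreted as B eight times, a spoken D was interpreted as B twice, and no other spoken element was interpreted as B, then $P_{BB} = 8/10 = 0.8$ and $P_{DB} = 2/10 = 0.2$.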

11. A method for identifying any one of a plurality of words using a programmed digital computing system, each word having an audible form representable by a sequence of speech elements each having a respective position in the sequence, wherein each speech element has at least one identifiable acoustic characteristic and a plurality of the speech elements are substantially identical with respect to the at least one identifiable acoustic characteristic, said method comprising:
- storing, in the digital computing system, a digital representation corresponding to each of the plurality of words;
  
  receiving a sequence of speech elements spoken by a person and representing the audible form of one of the plurality of words, and storing representations of the received speech elements and their respective positions in the spoken sequence;
  
  at each position in the spoken sequence, determining each speech element, other than the speech element for which a representation is stored, which is substantially identical to the speech element for which a representation is stored with respect to the at least one identifiable acoustic characteristic;
  
  comparing combinations of speech elements for which representations are stored and determined speech elements for a word with stored words; and
  
  identifying the stored word for which the comparison produces the best match with one of the combinations of speech elements.
- Dependent claims (12)
  - 12. A method as defined in claim 11 further comprising reproducing the stored word which is identified in said identifying step.
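The approach of claim 11 can be sketched in Python as follows. Everything concrete here is an assumption made for illustration: the grouping of letters in `CONFUSABLE_GROUPS` (letters whose spoken names share a vowel sound), the function names, and the rule that "best match" means the fewest substituted positions. Enumerating every combination grows exponentially with word length, so this form suits short inputs only.

```python
from itertools import product
from typing import List, Optional, Set

# Hypothetical groups of letters treated as substantially identical with
# respect to one acoustic characteristic (the vowel sound of the letter name).
CONFUSABLE_GROUPS: List[Set[str]] = [
    set("BCDEGPTVZ"),  # names ending in a long "ee" sound
    set("AJK"),        # names containing a long "ay" sound
]


def alternatives(element: str) -> Set[str]:
    """Return the received element plus every element in its group."""
    for group in CONFUSABLE_GROUPS:
        if element in group:
            return group
    return {element}


def identify(received: str, vocabulary: List[str]) -> Optional[str]:
    """Build all combinations of received and confusable elements, compare
    them with the stored words, and return the closest matching stored word."""
    options = [sorted(alternatives(e)) for e in received]
    candidates = {"".join(combo) for combo in product(*options)}
    matches = [w for w in vocabulary if w in candidates]
    if not matches:
        return None
    # Assumed tie-break: prefer the word differing from the input in the
    # fewest positions; min() keeps the earliest word on equal counts.
    return min(matches, key=lambda w: sum(a != b for a, b in zip(w, received)))


result = identify("DAT", ["DOG", "PAD", "BAT"])
```

Here "DOG" is rejected because O is not grouped with A, while "PAD" and "BAT" both match a combination; "BAT" is returned because it differs from the received sequence in one position rather than two.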

Current Assignee: Pronounced Technologies LLC
Original Assignee: Audio Navigation Systems, Inc.
Inventors: La Rue, Charles
Primary Examiner(s): MacDonald, Allen R.
Assistant Examiner(s): Dorvil, Richemond
Application Number: US08/437,057
Time in Patent Office: 1,092 Days
Field of Search: 395/2.6, 395/2.63, 395/2, 395/2.4, 395/2.48, 395/2.84, 395/2.09, 395/2.64, 381/29-47
US Class Current: 704/254
CPC Class Codes:
- B60R 16/0373   Voice control in general G10L
- G01C 21/36   Input/output arrangements f...
- G10L 15/08   Speech classification or se...
- G10L 15/1815   Semantic context, e.g. disa...
- G10L 15/187   Phonemic context, e.g. pron...
