Speech recognition system with efficient storage and rapid assembly of phonological graphs
Abstract
A continuous speech recognition system having a speech processor and a word recognition computer subsystem, characterized by: an element for developing a graph for confluent links between confluent nodes; an element for developing a graph of boundary links between adjacent words; an element for storing an inventory of confluent links and boundary links as a coding inventory; and an element for converting an unknown utterance into an encoded sequence of confluent links and boundary links corresponding to recognition sequences stored in the recognition vocabulary of the word recognition subsystem for speech recognition. The invention also includes a method for achieving continuous speech recognition by characterizing speech as a sequence of confluent links which are matched with candidate words. The invention applies to isolated-word speech recognition in the same way as to continuous speech recognition, except that in that case there are no boundary links.
23 Claims
1. An apparatus for storing electronic representations of words, said apparatus comprising:
means for storing a first electronic signal representing at least two alternative pronunciations of a portion of speech, said first signal comprising data having a data length, said first signal being identified by a first identifier with a length less than the data length of the first signal;
means for storing a second electronic signal representing a first word comprising the first portion of speech, said second signal comprising the first identifier for representing at least a portion of the first word; and
means for storing a third electronic signal representing a second word different from the first word, said second word comprising the first portion of speech, said third signal comprising the first identifier for representing at least a portion of the second word.
Dependent claims: 2, 3, 4, 5
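The storage scheme of claim 1 can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patent's implementation: the names `SpeechUnit`, `INVENTORY`, and `LEXICON`, the integer identifiers, the example words, and the phone symbols are all hypothetical. Each portion of speech is stored once with its alternative pronunciations, and every word that contains it stores only the short identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechUnit:
    """One stored portion of speech with its alternative pronunciations."""
    pronunciations: tuple  # each alternative is a tuple of phone symbols


# Inventory: short integer identifier -> stored unit (the "first signal").
# Identifiers and phone symbols are made up for illustration.
INVENTORY = {
    7: SpeechUnit(pronunciations=(("AE", "T"), ("AE", "DX"))),
    8: SpeechUnit(pronunciations=(("K",),)),
    9: SpeechUnit(pronunciations=(("B",),)),
}

# Word entries (the "second" and "third" signals) hold identifiers only.
LEXICON = {
    "cat": (8, 7),
    "bat": (9, 7),
}


def shared_units(word_a, word_b):
    """Identifiers that both words reference."""
    return sorted(set(LEXICON[word_a]) & set(LEXICON[word_b]))


def stored_symbols():
    """Phone symbols actually held in the inventory."""
    return sum(len(p) for u in INVENTORY.values() for p in u.pronunciations)


def expanded_symbols():
    """Phone symbols needed if every word kept its own copy of each unit."""
    return sum(
        len(p)
        for ids in LEXICON.values()
        for i in ids
        for p in INVENTORY[i].pronunciations
    )
```

In this toy inventory both words reference unit 7, so its two alternatives are stored once rather than once per word. The saving grows with the number of words that share a unit.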
6. An apparatus for storing electronic representations of words, said apparatus comprising:
means for storing a first electronic signal representing at least two alternative pronunciations of a first portion of speech, the occurrence of any one of said alternative pronunciations of the first portion of speech being independent of the occurrence of other portions of speech preceding or following the first portion of speech, said first signal comprising data having a data length, said first signal being identified by a first identifier with a length less than the data length of the first signal;
means for storing a second electronic signal representing at least two alternative pronunciations of a second portion of speech different from the first portion of speech, the occurrence of any one of said alternative pronunciations of the second portion of speech being dependent on the occurrence of other portions of speech following the second portion of speech, the occurrence of any one of said alternative pronunciations of the second portion of speech being independent of the occurrence of other portions of speech preceding the second portion of speech, said second signal comprising data having a data length, said second signal being identified by a second identifier with a length less than the data length of the second signal;
means for storing a third electronic signal representing at least two alternative pronunciations of a third portion of speech different from the first and second portions of speech, the occurrence of any one of said alternative pronunciations of the third portion of speech being dependent on the occurrence of other portions of speech preceding the third portion of speech, the occurrence of any one of said alternative pronunciations of the third portion of speech being independent of the occurrence of other portions of speech following the third portion of speech, said third signal comprising data having a data length, said third signal being identified by a third identifier with a length less than the data length of the third signal;
means for storing a fourth electronic signal representing a first word comprising the second portion of speech, said fourth signal comprising the second identifier for representing at least a portion of the first word;
means for storing a fifth electronic signal representing a second word different from the first word, said second word comprising the third portion of speech, said fifth signal comprising the third identifier for representing at least a portion of the second word; and
means for storing a sixth electronic signal comprising the first identifier for representing the second portion of speech followed by the third portion of speech.
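One possible reading of claim 6 is sketched below in Python. Everything in it is an assumption made for illustration: the `Context` enum, the `UNITS`, `LEXICON`, and `BOUNDARY` tables, the identifiers, and the phone symbols do not come from the patent. A word-final unit whose pronunciation depends on what follows and a word-initial unit whose pronunciation depends on what precedes are replaced, at the word junction, by the identifier of a context-independent unit that covers the pair.

```python
from enum import Enum


class Context(Enum):
    NONE = "independent of neighbouring speech"
    FOLLOWING = "depends on the speech that follows"
    PRECEDING = "depends on the speech that precedes"


# identifier -> (context dependence, alternative pronunciations)
# All identifiers and phone symbols are hypothetical.
UNITS = {
    1: (Context.NONE, (("T", "Y"), ("CH",))),       # junction of unit 2 + unit 3
    2: (Context.FOLLOWING, (("T",), ("CH",))),      # word-final portion
    3: (Context.PRECEDING, (("Y",), ())),           # word-initial portion
    10: (Context.NONE, (("W", "AH"), ("HH", "W", "AH"))),
    11: (Context.NONE, (("UW",), ("AX",))),
}

# Words store identifiers only (the "fourth" and "fifth" signals).
LEXICON = {"what": (10, 2), "you": (3, 11)}

# The "sixth signal": (word-final id, word-initial id) -> junction id.
BOUNDARY = {(2, 3): 1}


def encode_pair(first, second):
    """Join two words, replacing the two edge units by their junction unit."""
    left, right = LEXICON[first], LEXICON[second]
    link = BOUNDARY[(left[-1], right[0])]
    return left[:-1] + (link,) + right[1:]
```

For example, `encode_pair("what", "you")` yields a sequence in which every remaining unit is context independent, so each can be expanded without looking at its neighbours.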
7. A method of storing electronic representations of words, said method comprising:
storing a first electronic signal representing at least two alternative pronunciations of a portion of speech, said first signal comprising data having a data length, said first signal being identified by a first identifier with a length less than the data length of the first signal;
storing a second electronic signal representing a first word comprising the first portion of speech, said second signal comprising the identifier for representing at least a portion of the first word; and
storing a third electronic signal representing a second word different from the first word, said second word comprising the first portion of speech, said third signal comprising the first identifier for representing at least a portion of the second word.
Dependent claims: 8, 9, 10, 11
12. A method of storing electronic representations of words, said method comprising:
storing a first electronic signal representing at least two alternative pronunciations of a first portion of speech, the occurrence of any one of said alternative pronunciations of the first portion of speech being independent of the occurrence of other portions of speech preceding or following the first portion of speech, said first signal comprising data having a data length, said first signal being identified by a first identifier with a length less than the data length of the first signal;
storing a second electronic signal representing at least two alternative pronunciations of a second portion of speech different from the first portion of speech, the occurrence of any one of said alternative pronunciations of the second portion of speech being dependent on the occurrence of other portions of speech following the second portion of speech, the occurrence of any one of said alternative pronunciations of the second portion of speech being independent of the occurrence of other portions of speech preceding the second portion of speech, said second signal comprising data having a data length, said second signal being identified by a second identifier with a length less than the data length of the second signal;
storing a third electronic signal representing at least two alternative pronunciations of a third portion of speech different from the first and second portions of speech, the occurrence of any one of said alternative pronunciations of the third portion of speech being dependent on the occurrence of other portions of speech preceding the third portion of speech, the occurrence of any one of said alternative pronunciations of the third portion of speech being independent of the occurrence of other portions of speech following the third portion of speech, said third signal comprising data having a data length, said third signal being identified by a third identifier with a length less than the data length of the third signal;
storing a fourth electronic signal representing a first word comprising the second portion of speech, said fourth signal comprising the second identifier for representing at least a portion of the first word;
storing a fifth electronic signal representing a second word different from the first word, said second word comprising the third portion of speech, said fifth signal comprising the third identifier for representing at least a portion of the second word; and
storing a sixth electronic signal comprising the first identifier for representing the second portion of speech followed by the third portion of speech.
13. An apparatus for storing electronic representations of words, said apparatus comprising:
means for storing a speech unit signal, said speech unit signal comprising first data representing at least a first pronunciation of a first speech unit, said speech unit signal comprising second data representing at least a second pronunciation different from the first pronunciation of the first speech unit, said first and second data having a data length, said first signal being identified by a first identifier with a length less than the data length of the first signal;
means for storing a first word signal representing a first word comprising the first speech unit, said first word signal comprising the first identifier for representing at least a portion of the first word;
means for storing a second word signal representing a second word different from the first word, said second word comprising the first speech unit, said second word signal comprising the first identifier for representing at least a portion of the second word;
means for retrieving the first word signal and for retrieving the first identifier for representing at least a portion of the first word;
means for identifying the stored speech unit signal from the retrieved first identifier, and for retrieving the first data or the second data from the speech unit signal; and
means for generating a phone machine for one pronunciation of the portion of the first word from the retrieved first data or second data, but not both.
Dependent claims: 14, 15, 16, 17
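The retrieval and generation steps of claim 13 can be sketched in Python as follows. This is a rough illustration only: the tables `SPEECH_UNITS` and `WORDS`, the function `generate_phone_machine`, the identifiers, and the phone symbols are hypothetical, and the "phone machine" is reduced here to a plain left-to-right chain of states, which is an assumption about its shape rather than something the claim specifies.

```python
# identifier -> stored speech unit signal holding "first data" and "second data".
# Identifiers and phone symbols are made up for illustration.
SPEECH_UNITS = {
    5: {"first": ("IY", "DH", "ER"), "second": ("AY", "DH", "ER")},
    6: {"first": ("N",)},
}

# Word signals hold identifiers only; both words reference unit 5.
WORDS = {"either": (5,), "neither": (6, 5)}


def generate_phone_machine(word, position, which):
    """Build a state chain for one pronunciation of one portion of a word.

    `which` selects the first data or the second data, never both.
    """
    if which not in ("first", "second"):
        raise ValueError("which must be 'first' or 'second'")
    identifier = WORDS[word][position]   # retrieve word signal, then identifier
    unit = SPEECH_UNITS[identifier]      # identify the stored speech unit signal
    phones = unit[which]                 # retrieve exactly one alternative
    # One state per phone; "next" is the index of the following state,
    # or None for the final state of the chain.
    return [
        {"phone": p, "next": i + 1 if i + 1 < len(phones) else None}
        for i, p in enumerate(phones)
    ]
```

Calling `generate_phone_machine("either", 0, "second")` expands only the second alternative of unit 5. The other alternative stays in storage untouched, which is what keeps assembly cheap when only one pronunciation is needed.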
18. A method of storing electronic representations of words, said method comprising the steps of:
storing a speech unit signal, said speech unit signal comprising first data representing at least a first pronunciation of a first speech unit, said speech unit signal comprising second data representing at least a second pronunciation different from the first pronunciation of the first speech unit, said first and second data having a data length, said first signal being identified by a first identifier with a length less than the data length of the first signal;
storing a first word signal representing a first word comprising the first speech unit, said first word signal comprising the first identifier for representing at least a portion of the first word;
storing a second word signal representing a second word different from the first word, said second word comprising the first speech unit, said second word signal comprising the first identifier for representing at least a portion of the second word;
retrieving the first word signal and retrieving the first identifier for representing at least a portion of the first word;
identifying the stored speech unit signal from the retrieved first identifier, and retrieving the first data or the second data from the speech unit signal; and
generating a phone machine for one pronunciation of the portion of the first word from the retrieved first data or second data, but not both.
Dependent claims: 19, 20, 21, 22
23. A speech recognition apparatus comprising:
means for storing a speech unit signal, said speech unit signal comprising first data representing at least a first pronunciation of a first speech unit, said speech unit signal comprising second data representing at least a second pronunciation different from the first pronunciation of the first speech unit, said first and second data having a data length, said first signal being identified by a first identifier with a length less than the data length of the first signal;
means for storing a first word signal representing a first word comprising the first speech unit, said first word signal comprising the first identifier for representing at least a portion of the first word;
means for storing a second word signal representing a second word different from the first word, said second word comprising the first speech unit, said second word signal comprising the first identifier for representing at least a portion of the second word;
means for retrieving the first word signal and for retrieving the first identifier for representing at least a portion of the first word;
means for identifying the stored speech unit signal from the retrieved first identifier, and for retrieving the first data or the second data from the speech unit signal;
means for generating a phone machine for one pronunciation of the portion of the first word from the retrieved first data or second data, but not both;
means for converting a spoken sound into an utterance signal; and
means for matching the utterance signal to the phone machine and for outputting a match score signal proportional to the likelihood of the phone machine producing the utterance signal.
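Claim 23 does not state how the match score is computed. As one plausible illustration, the Python sketch below uses the standard forward algorithm for a left-to-right hidden Markov model, which is a technique swapped in here and not taken from the claim. The state layout (`emit`, `stay`), the acoustic labels `"a"` and `"b"`, and all probabilities are hypothetical.

```python
def match_score(machine, labels):
    """Likelihood that a left-to-right state chain produces the label sequence.

    Assumed layout: each state is a dict with
      "emit": label -> emission probability
      "stay": probability of remaining in the state (1 - stay = advance).
    """
    if not labels:
        return 0.0
    n = len(machine)
    # alpha[s]: probability of having emitted the labels so far, ending in state s.
    alpha = [0.0] * n
    alpha[0] = machine[0]["emit"].get(labels[0], 0.0)
    for label in labels[1:]:
        new = [0.0] * n
        for s, state in enumerate(machine):
            stay = alpha[s] * state["stay"]
            enter = alpha[s - 1] * (1.0 - machine[s - 1]["stay"]) if s > 0 else 0.0
            new[s] = (stay + enter) * state["emit"].get(label, 0.0)
        alpha = new
    # The chain must leave its final state to complete the match.
    return alpha[-1] * (1.0 - machine[-1]["stay"])


# A toy two-state machine with made-up probabilities.
TOY_MACHINE = [
    {"emit": {"a": 1.0}, "stay": 0.5},
    {"emit": {"b": 1.0}, "stay": 0.5},
]
```

With this toy machine, the label sequence `["a", "b"]` scores higher than `["a", "a", "b"]`, and a sequence the chain cannot produce, such as `["b", "a"]`, scores zero. A real recognizer would work with trained probabilities and, typically, log-domain arithmetic.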
Specification