Speech processing system
Abstract
A system is provided for allowing a user to add word models to a speech recognition system. In particular, the system allows a user to input a number of renditions of a new word and generates from these a sequence of phonemes representative of the new word. This representative sequence of phonemes is stored in a word-to-phoneme dictionary together with the typed version of the word for subsequent use by the speech recognition system.
167 Citations
78 Claims
-
1. An apparatus for generating a sequence of sub-word units representative of a new word to be added to a dictionary of a speech recognition system, the apparatus comprising:
-
receiving means for receiving signals representative of first and second spoken renditions of the new word; speech recognition means for comparing the received first and second spoken renditions with pre-stored sub-word unit models to generate first and second sequences of sub-word units representative of said first and second spoken renditions of the new word respectively; means for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; first comparing means for comparing, for each aligned pair, the first sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of comparison scores representative of the similarities between the first sequence sub-word unit and the respective sub-word units of the set; second comparing means for comparing, for each aligned pair, the second sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set, to generate a further corresponding plurality of comparison scores representative of the similarities between said second sequence sub-word unit and the respective sub-word units of the set; and means for determining, for each aligned pair of sub-word units, a sub-word unit representative of the sub-word units in the aligned pair in dependence upon the comparison scores generated by said first and second comparing means for the aligned pair, to determine a sequence of sub-word units representative of the spoken renditions of the new word. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23)
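The first comparing, second comparing and determining elements of claim 1 can be illustrated with a minimal sketch. It assumes the aligned pairs are already available, and everything named here is hypothetical: the six-unit set `UNIT_SET`, the toy score table `CONFUSION`, and the choice to combine the two scores by multiplication. The claim only requires that the representative unit be determined "in dependence upon" the comparison scores, so the product rule is one possible reading, not the patented method.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical set of predetermined sub-word units (a tiny phoneme inventory).
UNIT_SET: List[str] = ["k", "g", "ae", "eh", "t", "d"]

# Hypothetical similarity scores keyed by (observed unit, candidate unit).
# The values are invented for illustration and are deliberately asymmetric.
CONFUSION: Dict[Tuple[str, str], float] = {
    ("k", "g"): 0.30, ("g", "k"): 0.10,
    ("ae", "eh"): 0.20, ("eh", "ae"): 0.20,
    ("t", "d"): 0.10, ("d", "t"): 0.25,
}


def comparison_score(observed: str, candidate: str) -> float:
    """Similarity between an observed unit and a candidate unit from the set."""
    if observed == candidate:
        return 0.60
    return CONFUSION.get((observed, candidate), 0.02)


def representative_unit(first: str, second: str) -> str:
    """Pick the candidate that best explains both units of an aligned pair."""
    # First comparing step: first-sequence unit against every unit in the set.
    first_scores = {c: comparison_score(first, c) for c in UNIT_SET}
    # Second comparing step: second-sequence unit against the same set.
    second_scores = {c: comparison_score(second, c) for c in UNIT_SET}
    # Determining step (assumed rule): maximise the product of the two scores.
    return max(UNIT_SET, key=lambda c: first_scores[c] * second_scores[c])


def representative_sequence(pairs: Sequence[Tuple[str, str]]) -> List[str]:
    """Apply the per-pair decision to every aligned pair in order."""
    return [representative_unit(a, b) for a, b in pairs]


aligned_pairs = [("k", "g"), ("ae", "ae"), ("t", "d")]
result = representative_sequence(aligned_pairs)
print(result)  # ['g', 'ae', 't']
```

With these invented scores the pair ("k", "g") resolves to "g" because the table says "g" is often heard as "k" but rarely the reverse; a different score table would give a different representative sequence.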
-
-
24. An apparatus for adding a new word and sub-word representation of the new word to a word dictionary of a speech recognition system, the apparatus comprising:
-
means for receiving a first sequence of sub-word units representative of a first spoken rendition of the new word and for receiving a second sequence of sub-word units representative of a second spoken rendition of the new word; means for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; first comparing means for comparing, for each aligned pair, the first sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of comparison scores representative of the similarities between the first sequence sub-word unit and the respective sub-word units of the set; second comparing means for comparing, for each aligned pair, the second sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set, to generate a further corresponding plurality of comparison scores representative of the similarities between said second sequence sub-word unit and the respective sub-word units of the set; means for determining, for each aligned pair of sub-word units, a sub-word unit representative of the sub-word units in the aligned pair in dependence upon the comparison scores generated by said first and second comparing means for the aligned pair, to determine a sequence of sub-word units representative of the spoken renditions of the new word; and means for adding the new word and the representative sequence of sub-word units to said word dictionary.
-
-
25. A speech recognition system comprising:
-
means for receiving speech signals to be recognised; means for storing sub-word unit models; means for matching received speech with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals; a word dictionary relating sequences of sub-word units to words; a word decoder for processing the one or more sequences of sub-word units output by said matching means using the word dictionary to generate one or more words corresponding to the received speech signals; an apparatus for adding a new word and a sub-word representation of the new word to the word dictionary; and means for controllably connecting the output of said matching means to either said word decoder or said apparatus for adding the new word and a sub-word representation of the new word to the word dictionary; characterised in that said apparatus for adding the new word and a sub-word representation of the new word to the word dictionary comprises: means for receiving a first sequence of sub-word units representative of a first spoken rendition of the new word output by said matching means and for receiving a second sequence of sub-word units representative of a second spoken rendition of the new word output by said matching means; means for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; first comparing means for comparing, for each aligned pair, the first sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of comparison scores representative of the similarities between the first sequence sub-word unit and the respective sub-word units of the set; second comparing means for comparing, for each aligned pair, the second sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set, to generate a further corresponding plurality of comparison scores representative of the similarities between said second sequence sub-word unit and the respective sub-word units of the set; means for determining, for each aligned pair of sub-word units, a sub-word unit representative of the sub-word units in the aligned pair in dependence upon the comparison scores generated by said first and second comparing means for the aligned pair, to determine a sequence of sub-word units representative of the spoken renditions of the new word; means for receiving a text rendition of the new word; and means for adding said text rendition of the new word and the representative sequence of sub-word units to said word dictionary.
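The controllable connection recited in claim 25, which routes the matcher output either to the word decoder or to the word-adding apparatus, might be sketched as follows. All names are hypothetical (`ToyRecognitionSystem`, `matcher`, `merge`, `finish_training`). The matcher is a stand-in that splits a string into units, the decoder is an exact lookup, and `merge` is a placeholder for the align, compare and determine stages, so this shows only the routing and the dictionary update.

```python
from typing import Callable, Dict, List, Optional, Tuple

Units = Tuple[str, ...]


class ToyRecognitionSystem:
    """Routes matcher output to a decoder or to a word-adding stage."""

    def __init__(self, matcher: Callable[[str], Units],
                 merge: Callable[[Units, Units], Units]) -> None:
        self.matcher = matcher          # stand-in for sub-word unit matching
        self.merge = merge              # stand-in for align/compare/determine
        self.dictionary: Dict[Units, str] = {}  # unit sequence -> word text
        self.training = False           # position of the "switch"
        self._renditions: List[Units] = []

    def start_training(self) -> None:
        """Connect the matcher output to the word-adding stage."""
        self.training = True
        self._renditions = []

    def process(self, speech: str) -> Optional[str]:
        """Match the input, then route the units according to the switch."""
        units = self.matcher(speech)
        if self.training:
            self._renditions.append(units)
            return None
        # Toy word decoder: exact lookup of the whole unit sequence.
        return self.dictionary.get(units)

    def finish_training(self, text: str) -> Units:
        """Merge two renditions and store them with the typed word."""
        if len(self._renditions) < 2:
            raise ValueError("two spoken renditions are required")
        first, second = self._renditions[:2]
        representative = self.merge(first, second)
        self.dictionary[representative] = text
        self.training = False           # reconnect the matcher to the decoder
        return representative


# Toy stand-ins: "speech" is a space-separated unit string, and the merge
# simply keeps the first rendition.
system = ToyRecognitionSystem(matcher=lambda s: tuple(s.split()),
                              merge=lambda a, b: a)
before = system.process("k ae t")       # unknown word, nothing decoded
system.start_training()
system.process("k ae t")
system.process("g ae t")
stored = system.finish_training("cat")
after = system.process("k ae t")        # now decoded from the dictionary
```

A real decoder would search the dictionary probabilistically rather than by exact match; the exact lookup here only keeps the routing logic visible.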
-
-
26. A method of generating a sequence of sub-word units representative of a new word to be added to a dictionary of a speech recognition system, the method comprising:
-
receiving signals representative of first and second spoken renditions of the new word; comparing the received first and second spoken renditions with pre-stored sub-word unit models to generate a first sequence of sub-word units representative of said first spoken rendition of the new word and a second sequence of sub-word units representative of said second spoken rendition of the new word; aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; a first comparing step of comparing, for each aligned pair, the first sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of comparison scores representative of the similarities between the first sequence sub-word unit and the respective sub-word units of the set; a second comparing step of comparing, for each aligned pair, the second sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set, to generate a further corresponding plurality of comparison scores representative of the similarities between said second sequence sub-word unit and the respective sub-word units of the set; and determining, for each aligned pair of sub-word units, a sub-word unit representative of the sub-word units in the aligned pair in dependence upon the comparison scores generated by said first and second comparing steps for the aligned pair, to determine a sequence of sub-word units representative of the spoken renditions of the new word. - View Dependent Claims (27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47)
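The aligning step of claim 26 does not name an algorithm in the text above. One common way to pair up two unit sequences is an edit-distance style dynamic programme, sketched here as an assumption rather than as the claimed technique. The function name `align`, the unit costs, and the use of `None` to mark a unit with no partner are all illustrative choices.

```python
from typing import List, Optional, Sequence, Tuple

AlignedPair = Tuple[Optional[str], Optional[str]]


def align(first: Sequence[str], second: Sequence[str],
          sub_cost: int = 1, gap_cost: int = 1) -> List[AlignedPair]:
    """Align two unit sequences by minimum edit cost; None marks a gap."""
    n, m = len(first), len(second)
    # cost[i][j] = cheapest alignment of first[:i] with second[:j].
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap_cost
    for j in range(1, m + 1):
        cost[0][j] = j * gap_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = 0 if first[i - 1] == second[j - 1] else sub_cost
            cost[i][j] = min(cost[i - 1][j - 1] + diag,
                             cost[i - 1][j] + gap_cost,
                             cost[i][j - 1] + gap_cost)

    # Trace back from the end, preferring a pairing over a gap on ties.
    pairs: List[AlignedPair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            diag = 0 if first[i - 1] == second[j - 1] else sub_cost
            if cost[i][j] == cost[i - 1][j - 1] + diag:
                pairs.append((first[i - 1], second[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + gap_cost:
            pairs.append((first[i - 1], None))   # unit only in first sequence
            i -= 1
        else:
            pairs.append((None, second[j - 1]))  # unit only in second sequence
            j -= 1
    pairs.reverse()
    return pairs


same_length = align(["k", "ae", "t"], ["g", "ae", "t"])
with_gap = align(["k", "ae", "t"], ["k", "t"])
print(same_length)  # [('k', 'g'), ('ae', 'ae'), ('t', 't')]
print(with_gap)     # [('k', 'k'), ('ae', None), ('t', 't')]
```

The claim speaks only of aligned pairs, so how an unpaired unit should be treated in the later comparing and determining steps is left open in this sketch.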
-
-
48. A method of adding a new word and sub-word representation of the new word to a word dictionary of a speech recognition system, the method comprising the steps of:
-
receiving a first sequence of sub-word units representative of a first spoken rendition of the new word and receiving a second sequence of sub-word units representative of a second spoken rendition of the new word; aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; a first comparing step of comparing, for each aligned pair, the first sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of comparison scores representative of the similarities between the first sequence sub-word unit and the respective sub-word units of the set; a second comparing step of comparing, for each aligned pair, the second sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set, to generate a further corresponding plurality of comparison scores representative of the similarities between said second sequence sub-word unit and the respective sub-word units of the set; determining, for each aligned pair of sub-word units, a sub-word unit representative of the sub-word units in the aligned pair in dependence upon the comparison scores generated by said first and second comparing steps for the aligned pair, to determine a sequence of sub-word units representative of the spoken renditions of the new word; and adding the new word and the representative sequence of sub-word units to said word dictionary.
-
-
49. A speech recognition method comprising the steps of:
-
receiving speech signals to be recognised; storing sub-word unit models; matching received speech signals with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals; storing a word dictionary relating sequences of sub-word units to words; processing the one or more sequences of sub-word units output by said matching step using the stored word dictionary to generate one or more words corresponding to the received speech signals; a step of adding a new word and a sub-word representation of the new word to the word dictionary; and controllably feeding the output of said matching step to either said processing step or said adding step; characterised in that said adding step comprises: receiving a first sequence of sub-word units representative of a first spoken rendition of the new word output by said matching step and receiving a second sequence of sub-word units representative of a second spoken rendition of the new word output by said matching step; aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; a first comparing step of comparing, for each aligned pair, the first sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of comparison scores representative of the similarities between the first sequence sub-word unit and the respective sub-word units of the set; a second comparing step of comparing, for each aligned pair, the second sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set, to generate a further corresponding plurality of comparison scores representative of the similarities between said second sequence sub-word unit and the respective sub-word units of the set; determining, for each aligned pair of sub-word units, a sub-word unit representative of the sub-word units in the aligned pair in dependence upon the comparison scores generated by said first and second comparing steps for the aligned pair, to determine a sequence of sub-word units representative of the spoken renditions of the new word; receiving a text rendition of the new word; and adding said text rendition of the new word and the representative sequence of sub-word units to said word dictionary.
-
-
50. A storage medium storing processor implementable instructions for controlling a processor to carry out a method of generating a sequence of sub-word units representative of a new word to be added to a dictionary of a speech recognition system, the processor instructions comprising:
-
instructions for receiving signals representative of first and second spoken renditions of the new word; instructions for comparing the received first and second spoken renditions with pre-stored sub-word unit models to generate a first sequence of sub-word units representative of said first spoken rendition of the new word and a second sequence of sub-word units representative of said second spoken rendition of the new word; instructions for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; instructions for a first comparing step of comparing, for each aligned pair, the first sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of comparison scores representative of the similarities between the first sequence sub-word unit and the respective sub-word units of the set; instructions for a second comparing step of comparing, for each aligned pair, the second sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set, to generate a further corresponding plurality of comparison scores representative of the similarities between said second sequence sub-word unit and the respective sub-word units of the set; and instructions for determining, for each aligned pair of sub-word units, a sub-word unit representative of the sub-word units in the aligned pair in dependence upon the comparison scores generated by said first and second comparing steps for the aligned pair, to determine a sequence of sub-word units representative of the spoken renditions of the new word.
-
-
51. A storage medium storing processor implementable instructions for controlling a processor to carry out a method of adding a new word and sub-word representation of the new word to a word dictionary of a speech recognition system, the processor instructions comprising:
-
instructions for receiving a first sequence of sub-word units representative of a first spoken rendition of the new word and for receiving a second sequence of sub-word units representative of a second spoken rendition of the new word; instructions for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; instructions for a first comparing step of comparing, for each aligned pair, the first sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of comparison scores representative of the similarities between the first sequence sub-word unit and the respective sub-word units of the set; instructions for a second comparing step of comparing, for each aligned pair, the second sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set, to generate a further corresponding plurality of comparison scores representative of the similarities between said second sequence sub-word unit and the respective sub-word units of the set; instructions for determining, for each aligned pair of sub-word units, a sub-word unit representative of the sub-word units in the aligned pair in dependence upon the comparison scores generated by said first and second comparing steps for the aligned pair, to determine a sequence of sub-word units representative of the spoken renditions of the new word; and instructions for adding the new word and the representative sequence of sub-word units to said word dictionary.
-
-
52. A storage medium storing processor implementable instructions for controlling a processor to carry out a speech recognition method, the processor instructions comprising:
-
instructions for receiving speech signals to be recognised; instructions for storing sub-word unit models; instructions for matching received speech signals with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals; instructions for storing a word dictionary relating sequences of sub-word units to words; instructions for processing the one or more sequences of sub-word units output by said matching step using the stored word dictionary to generate one or more words corresponding to the received speech signals; instructions for adding a new word and a sub-word representation of the new word to the word dictionary; and instructions for controllably feeding the output of said matching step to either said processing step or said adding step; characterised in that said adding instructions comprise: instructions for receiving a first sequence of sub-word units representative of a first spoken rendition of the new word output by said matching step and for receiving a second sequence of sub-word units representative of a second spoken rendition of the new word output by said matching step; instructions for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; instructions for a first comparing step of comparing, for each aligned pair, the first sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of comparison scores representative of the similarities between the first sequence sub-word unit and the respective sub-word units of the set; instructions for a second comparing step of comparing, for each aligned pair, the second sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set, to generate a further corresponding plurality of comparison scores representative of the similarities between said second sequence sub-word unit and the respective sub-word units of the set; instructions for determining, for each aligned pair of sub-word units, a sub-word unit representative of the sub-word units in the aligned pair in dependence upon the comparison scores generated by said first and second comparing steps for the aligned pair, to determine a sequence of sub-word units representative of the spoken renditions of the new word; instructions for receiving a text rendition of the new word; and instructions for adding said text rendition of the new word and the representative sequence of sub-word units to said word dictionary.
-
-
53. Processor implementable instructions for controlling a processor to carry out a method of generating a sequence of sub-word units representative of a new word to be added to a dictionary of a speech recognition system, the processor instructions comprising:
-
instructions for receiving signals representative of first and second spoken renditions of the new word; instructions for matching the received first and second spoken renditions with pre-stored sub-word unit models to generate a first sequence of sub-word units representative of said first spoken rendition of the new word and a second sequence of sub-word units representative of said second spoken rendition of the new word; instructions for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; instructions for a first comparing step of comparing, for each aligned pair, the first sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of comparison scores representative of the similarities between the first sequence sub-word unit and the respective sub-word units of the set; instructions for a second comparing step of comparing, for each aligned pair, the second sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set, to generate a further corresponding plurality of comparison scores representative of the similarities between said second sequence sub-word unit and the respective sub-word units of the set; and instructions for determining, for each aligned pair of sub-word units, a sub-word unit representative of the sub-word units in the aligned pair in dependence upon the comparison scores generated by said first and second comparing steps for the aligned pair, to determine a sequence of sub-word units representative of the spoken renditions of the new word.
-
-
54. Processor implementable instructions for controlling a processor to carry out a method of adding a new word and sub-word representation of the new word to a word dictionary of a speech recognition system, the processor instructions comprising:
-
instructions for receiving a first sequence of sub-word units representative of a first spoken rendition of the new word and for receiving a second sequence of sub-word units representative of a second spoken rendition of the new word; instructions for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; instructions for a first comparing step of comparing, for each aligned pair, the first sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of comparison scores representative of the similarities between the first sequence sub-word unit and the respective sub-word units of the set; instructions for a second comparing step of comparing, for each aligned pair, the second sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set, to generate a further corresponding plurality of comparison scores representative of the similarities between said second sequence sub-word unit and the respective sub-word units of the set; instructions for determining, for each aligned pair of sub-word units, a sub-word unit representative of the sub-word units in the aligned pair in dependence upon the comparison scores generated by said first and second comparing steps for the aligned pair, to determine a sequence of sub-word units representative of the spoken renditions of the new word; and instructions for adding the new word and the representative sequence of sub-word units to said word dictionary.
-
-
55. Processor implementable instructions for controlling a processor to carry out a speech recognition method, the processor instructions comprising:
-
instructions for receiving speech signals to be recognised; instructions for storing sub-word unit models; instructions for matching received speech signals with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals; instructions for storing a word dictionary relating sequences of sub-word units to words; instructions for processing the one or more sequences of sub-word units output by said matching step using the stored word dictionary to generate one or more words corresponding to the received speech signals; instructions for adding a new word and a sub-word representation of the new word to the word dictionary; and instructions for controllably feeding the output of said matching step to either said processing step or said adding step; characterised in that said adding instructions comprise: instructions for receiving a first sequence of sub-word units representative of a first spoken rendition of the new word output by said matching step and for receiving a second sequence of sub-word units representative of a second spoken rendition of the new word output by said matching step; instructions for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; instructions for a first comparing step of comparing, for each aligned pair, the first sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of comparison scores representative of the similarities between the first sequence sub-word unit and the respective sub-word units of the set; instructions for a second comparing step of comparing, for each aligned pair, the second sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set, to generate a further corresponding plurality of comparison scores representative of the similarities between said second sequence sub-word unit and the respective sub-word units of the set; instructions for determining, for each aligned pair of sub-word units, a sub-word unit representative of the sub-word units in the aligned pair in dependence upon the comparison scores generated by said first and second comparing steps for the aligned pair, to determine a sequence of sub-word units representative of the spoken renditions of the new word; instructions for receiving a text rendition of the new word; and instructions for adding said text rendition of the new word and the representative sequence of sub-word units to said word dictionary.
-
-
56. An apparatus for generating a sequence of sub-word units representative of a new word to be added to a dictionary of a speech recognition system, the apparatus comprising:
-
a receiver operable to receive signals representative of first and second spoken renditions of the new word; a speech recogniser operable to compare the received first and second spoken renditions with pre-stored sub-word unit models to generate first and second sequences of sub-word units representative of said first and second spoken renditions of the new word respectively; a sub-word unit aligner operable to align sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; a first comparator operable to compare, for each aligned pair, the first sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of comparison scores representative of the similarities between the first sequence sub-word unit and the respective sub-word units of the set; a second comparator operable to compare, for each aligned pair, the second sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set, to generate a further corresponding plurality of comparison scores representative of the similarities between said second sequence sub-word unit and the respective sub-word units of the set; and a determiner operable to determine, for each aligned pair of sub-word units, a sub-word unit representative of the sub-word units in the aligned pair in dependence upon the comparison scores generated by said first and second comparators for the aligned pair, to determine a sequence of sub-word units representative of the spoken renditions of the new word. - View Dependent Claims (57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76)
-
-
77. An apparatus for adding a new word and sub-word representation of the new word to a word dictionary of a speech recognition system, the apparatus comprising:
-
a receiver operable to receive a first sequence of sub-word units representative of a first spoken rendition of the new word and to receive a second sequence of sub-word units representative of a second spoken rendition of the new word; a sub-word unit aligner operable to align sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; a first comparator operable to compare, for each aligned pair, the first sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of comparison scores representative of the similarities between the first sequence sub-word unit and the respective sub-word units of the set; a second comparator operable to compare, for each aligned pair, the second sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set, to generate a further corresponding plurality of comparison scores representative of the similarities between said second sequence sub-word unit and the respective sub-word units of the set; a determiner operable to determine, for each aligned pair of sub-word units, a sub-word unit representative of the sub-word units in the aligned pair in dependence upon the comparison scores generated by said first and second comparators for the aligned pair, to determine a sequence of sub-word units representative of the spoken renditions of the new word; and an apparatus operable to add the new word and the representative sequence of sub-word units to said word dictionary.
-
-
78. A speech recognition system comprising:
-
a first receiver operable to receive speech signals to be recognised; a store operable to store sub-word unit models; a comparator operable to compare received speech with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals; a word dictionary relating sequences of sub-word units to words; a word decoder operable to process the one or more sequences of sub-word units output by said comparator using the word dictionary to generate one or more words corresponding to the received speech signals; a first apparatus operable to add a new word and a sub-word representation of the new word to the word dictionary; and a switch operable to controllably connect the output of said comparator to either said word decoder or said apparatus for adding the new word and a sub-word representation of the new word to the word dictionary; characterised in that said apparatus for adding the new word and a sub-word representation of the new word to the word dictionary comprises: a second receiver operable to receive a first sequence of sub-word units representative of a first spoken rendition of the new word output by said comparator and to receive a second sequence of sub-word units representative of a second spoken rendition of the new word output by said comparator; a sub-word unit aligner operable to align sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; a first comparator operable to compare, for each aligned pair, the first sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of comparison scores representative of the similarities between the first sequence sub-word unit and the respective sub-word units of the set; a second comparator operable to compare, for each aligned pair, the second sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set, to generate a further corresponding plurality of comparison scores representative of the similarities between said second sequence sub-word unit and the respective sub-word units of the set; a determiner operable to determine, for each aligned pair of sub-word units, a sub-word unit representative of the sub-word units in the aligned pair in dependence upon the comparison scores generated by said first and second comparators for the aligned pair, to determine a sequence of sub-word units representative of the spoken renditions of the new word; a third receiver operable to receive a text rendition of the new word; and a second apparatus operable to add said text rendition of the new word and the representative sequence of sub-word units to said word dictionary.
-
Specification