Speech processing system
Abstract
A system is provided for allowing a user to add word models to a speech recognition system. In particular, the system allows a user to input a number of renditions of a new word and generates from these a sequence of phonemes representative of the new word. This representative sequence of phonemes is stored in a word-to-phoneme dictionary together with the typed version of the word for subsequent use by the speech recognition system.
62 Claims
1. An apparatus for generating a sequence of sub-word units representative of a new word to be added to a dictionary of a speech recognition system, the apparatus comprising:
first receiving means for receiving signals representative of first and second spoken renditions of the new word;
speech recognition means for comparing the received first and second spoken renditions with pre-stored sub-word unit models to generate first and second sequences of sub-word units representative of said first and second spoken renditions of the new word respectively;
means for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; and
means for determining a sequence of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned pairs of sub-word units determined by said aligning means.
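The aligning and determining elements of claim 1 can be illustrated with a short sketch. The claim does not fix an alignment method, so the unit-cost dynamic programming (edit distance) used here is an assumption, as are the function names `align` and `representative` and the placeholder rule of preferring the first rendition where the two disagree.

```python
from typing import List, Optional, Tuple

Pair = Tuple[Optional[str], Optional[str]]  # None marks a gap in one rendition


def align(seq1: List[str], seq2: List[str]) -> List[Pair]:
    """Align two sub-word unit sequences using unit-cost edit distance."""
    n, m = len(seq1), len(seq2)
    # cost[i][j] = minimum number of edits aligning seq1[:i] with seq2[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (seq1[i - 1] != seq2[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the end of both sequences to recover the aligned pairs.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (seq1[i - 1] != seq2[j - 1])):
            pairs.append((seq1[i - 1], seq2[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((seq1[i - 1], None))  # unit only in first rendition
            i -= 1
        else:
            pairs.append((None, seq2[j - 1]))  # unit only in second rendition
            j -= 1
    pairs.reverse()
    return pairs


def representative(pairs: List[Pair]) -> List[str]:
    """Placeholder rule (an assumption): keep the first rendition's unit,
    falling back to the second rendition's unit where the first has a gap."""
    return [a if a is not None else b for a, b in pairs]
```

For example, `align(["k", "ae", "t"], ["k", "ah", "t"])` pairs `"ae"` with `"ah"`, and `representative` then returns the first rendition's units. A practical system would likely replace the placeholder rule with one that weighs how confusable the paired units are.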
-
-
23. An apparatus for generating a sequence of sub-word units representative of a new word to be added to a dictionary of a speech recognition system, the apparatus comprising:
means for receiving signals representative of a plurality of spoken renditions of the new word;
speech recognition means for comparing the received spoken renditions with pre-stored sub-word unit models to generate a corresponding plurality of sequences of sub-word units representative of said plurality of spoken renditions;
means for aligning sub-word units of each rendition with sub-word units of other renditions to form a number of aligned groups of sub-word units, each aligned group comprising a sub-word unit from each rendition; and
means for determining a sequence of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned groups of sub-word units determined by said aligning means.
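For the plural-rendition case of claim 23, the determining step can be sketched as a vote over each aligned group. This is an illustration only: it assumes the alignment has already produced the groups, that `None` marks a rendition with no unit at that position, and that a simple majority decides. The name `vote` is hypothetical and the claim itself does not prescribe majority voting.

```python
from collections import Counter
from typing import List, Optional, Sequence


def vote(groups: Sequence[Sequence[Optional[str]]]) -> List[str]:
    """Pick the most frequent unit in each aligned group.

    Ties go to the unit seen first in the group, i.e. the earliest rendition.
    Groups whose winner is a gap (None) contribute nothing to the output.
    """
    result: List[str] = []
    for group in groups:
        unit, _count = Counter(group).most_common(1)[0]
        if unit is not None:
            result.append(unit)
    return result


# Three renditions, already aligned into four groups (illustrative data).
groups = [
    ("k", "k", "g"),
    ("ae", "ah", "ae"),
    (None, None, "s"),
    ("t", "t", "t"),
]
print(vote(groups))  # ['k', 'ae', 't']
```

Majority voting treats all disagreements alike; weighting votes by recognition confidence would be one alternative under the same claim language.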
-
-
24. An apparatus for adding a new word and sub-word representation of the new word to a word dictionary of a speech recognition system, the apparatus comprising:
means for receiving a first sequence of sub-word units representative of a first spoken rendition of the new word and for receiving a second sequence of sub-word units representative of a second spoken rendition of the new word;
means for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units;
means for determining a sequence of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned pairs of sub-word units determined by said aligning means; and
means for adding the new word and the representative sequence of sub-word units to said word dictionary.
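The final element of claim 24, adding the new word together with its representative sequence to the word dictionary, can be pictured with a small data structure. The class `WordDictionary` and its methods are hypothetical names for illustration, and storing several pronunciations per word is an assumption rather than something the claim requires.

```python
from typing import Dict, List, Sequence, Tuple


class WordDictionary:
    """Maps each word to one or more sequences of sub-word units."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[Tuple[str, ...]]] = {}

    def add(self, word: str, units: Sequence[str]) -> None:
        """Store a representative sequence for a word, ignoring exact repeats."""
        pronunciations = self._entries.setdefault(word, [])
        key = tuple(units)
        if key not in pronunciations:
            pronunciations.append(key)

    def lookup(self, units: Sequence[str]) -> List[str]:
        """Return every word stored with exactly this unit sequence."""
        key = tuple(units)
        return [w for w, prons in self._entries.items() if key in prons]


dictionary = WordDictionary()
dictionary.add("cat", ["k", "ae", "t"])
print(dictionary.lookup(["k", "ae", "t"]))  # ['cat']
```

Exact-match lookup is used only to keep the sketch short; a word decoder would normally score approximate matches against the stored sequences.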
-
-
25. A speech recognition system comprising:
means for receiving speech signals to be recognised;
means for storing sub-word unit models;
means for comparing received speech with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals;
a word dictionary relating sequences of sub-word units to words;
a word decoder for processing the one or more sequences of sub-word units output by said comparing means using the word dictionary to generate one or more words corresponding to the received speech signals;
an apparatus for adding a new word and a sub-word representation of the new word to the word dictionary; and
means for controllably connecting the output of said comparing means to either said word decoder or said apparatus for adding the new word and a sub-word representation of the new word to the word dictionary;
characterised in that said apparatus for adding the new word and a sub-word representation of the new word to the word dictionary comprises;
means for receiving a first sequence of sub-word units representative of a first spoken rendition of the new word output by said comparing means and for receiving a second sequence of sub-word units representative of a second spoken rendition of the new word output by said comparing means;
means for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units;
means for determining sequences of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned pairs of sub-word units determined by said aligning means;
means for receiving a text rendition of the new word; and
means for adding said text rendition of the new word and the representative sequence of sub-word units to said word dictionary.
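The "means for controllably connecting" in claim 25 routes the output of the comparing means either to the word decoder or to the apparatus that adds new words. A minimal sketch of that switching, assuming both destinations are plain callables and using the hypothetical names `Mode` and `Router`:

```python
from enum import Enum
from typing import Any, Callable, Sequence

UnitHandler = Callable[[Sequence[str]], Any]


class Mode(Enum):
    RECOGNISE = "recognise"  # send unit sequences to the word decoder
    ADD_WORD = "add_word"    # send unit sequences to the word-adding apparatus


class Router:
    """Feeds each recognised unit sequence to exactly one destination."""

    def __init__(self, decoder: UnitHandler, trainer: UnitHandler) -> None:
        self.mode = Mode.RECOGNISE
        self._decoder = decoder
        self._trainer = trainer

    def feed(self, units: Sequence[str]) -> Any:
        if self.mode is Mode.ADD_WORD:
            return self._trainer(units)
        return self._decoder(units)
```

In use, a caller would set `router.mode = Mode.ADD_WORD` while the user speaks renditions of the new word and switch back to `Mode.RECOGNISE` afterwards.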
-
-
26. A speech recognition system comprising:
means for receiving speech signals to be recognised;
means for storing sub-word unit models;
means for comparing received speech with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals;
a command dictionary relating sequences of sub-word units to commands; and
a command decoder for processing the one or more sequences of sub-word units output by said comparing means using the command dictionary to generate one or more commands corresponding to the received speech signals;
an apparatus for adding a new command and a sub-word representation of the new command to the command dictionary; and
means for controllably connecting the output of said comparing means to either said command decoder or said apparatus for adding the new command and a sub-word representation of the new command to the command dictionary;
characterised in that said apparatus for adding the new command and a sub-word representation of the new command to the command dictionary comprises;
means for receiving a first sequence of sub-word units representative of a first spoken rendition of the new command output by said comparing means and for receiving a second sequence of sub-word units representative of a second spoken rendition of the new command output by said comparing means;
means for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units;
means for determining sequences of sub-word units representative of the spoken renditions of the new command in dependence upon the aligned pairs of sub-word units determined by said aligning means; and
means for adding said representative sequence of sub-word units to said command dictionary together with the corresponding new command.
-
-
27. A method of generating a sequence of sub-word units representative of a new word to be added to a dictionary of a speech recognition system, the method comprising the steps of:
a first receiving step of receiving signals representative of first and second spoken renditions of the new word;
comparing the received first and second spoken renditions with pre-stored sub-word unit models to generate a first sequence of sub-word units representative of said first spoken rendition of the new word and a second sequence of sub-word units representative of said second spoken rendition of the new word;
aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; and
determining a sequence of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned pairs of sub-word units determined by said aligning step.
-
-
49. A method of generating a sequence of sub-word units representative of a new word to be added to a dictionary of a speech recognition system, the method comprising the steps of:
receiving signals representative of a plurality of spoken renditions of the new word;
comparing the received spoken renditions with pre-stored sub-word unit models to generate a corresponding plurality of sequences of sub-word units representative of said plurality of spoken renditions;
aligning sub-word units of each rendition with sub-word units of other renditions to form a number of aligned groups of sub-word units, each aligned group comprising a sub-word unit from each rendition; and
determining a sequence of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned groups of sub-word units determined by said aligning step.
-
-
50. A method of adding a new word and sub-word representation of the new word to a word dictionary of a speech recognition system, the method comprising the steps of:
receiving a first sequence of sub-word units representative of a first spoken rendition of the new word and receiving a second sequence of sub-word units representative of a second spoken rendition of the new word;
aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units;
determining a sequence of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned pairs of sub-word units determined by said aligning step; and
adding the new word and the representative sequence of sub-word units to said word dictionary.
-
-
51. A speech recognition method comprising the steps of:
receiving speech signals to be recognised;
storing sub-word unit models;
comparing received speech signals with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals;
storing a word dictionary relating sequences of sub-word units to words;
processing the one or more sequences of sub-word units output by said comparing step using the stored word dictionary to generate one or more words corresponding to the received speech signals;
the step of adding a new word and a sub-word representation of the new word to the word dictionary; and
controllably feeding the output of said comparing step to either said processing step or said adding step;
characterised in that said adding step comprises;
receiving a first sequence of sub-word units representative of a first spoken rendition of the new word output by said comparing step and receiving a second sequence of sub-word units representative of a second spoken rendition of the new word output by said comparing step;
aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units;
determining sequences of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned pairs of sub-word units determined by said aligning step;
receiving a text rendition of the new word; and
adding said text rendition of the new word and the representative sequence of sub-word units to said word dictionary.
-
-
52. A speech recognition method comprising the steps of:
receiving speech signals to be recognised;
storing sub-word unit models;
comparing received speech signals with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals;
storing a command dictionary relating sequences of sub-word units to commands;
processing the one or more sequences of sub-word units output by said comparing step using the stored command dictionary to generate one or more commands corresponding to the received speech signals;
adding a new command and a sub-word representation of the new command to the command dictionary; and
controllably feeding the output of said comparing step to either said processing step or said adding step;
characterised in that said adding step comprises;
receiving a first sequence of sub-word units representative of a first spoken rendition of the new command output by said comparing step and receiving a second sequence of sub-word units representative of a second spoken rendition of the new command output by said comparing step;
aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units;
determining sequences of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned pairs of sub-word units determined by said aligning step; and
adding said sequence of sub-word units to said command dictionary together with the associated new command.
-
-
53. A storage medium storing processor implementable instructions for controlling a processor to carry out a method of generating a sequence of sub-word units representative of a new word to be added to a dictionary of a speech recognition system, the processor instructions comprising:
first receiving instructions for receiving signals representative of first and second spoken renditions of the new word;
instructions for comparing the received first and second spoken renditions with pre-stored sub-word unit models to generate a first sequence of sub-word units representative of said first spoken rendition of the new word and a second sequence of sub-word units representative of said second spoken rendition of the new word;
instructions for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; and
instructions for determining a sequence of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned pairs of sub-word units determined by said aligning step.
-
-
54. A storage medium storing processor implementable instructions for controlling a processor to carry out a method of generating a sequence of sub-word units representative of a new word to be added to a dictionary of a speech recognition system, the processor instructions comprising:
instructions for receiving signals representative of a plurality of spoken renditions of the new word;
instructions for comparing the received spoken renditions with pre-stored sub-word unit models to generate a corresponding plurality of sequences of sub-word units representative of said plurality of spoken renditions;
instructions for aligning sub-word units of each rendition with sub-word units of other renditions to form a number of aligned groups of sub-word units, each aligned group comprising a sub-word unit from each rendition; and
instructions for determining a sequence of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned groups of sub-word units determined by said aligning step.
-
-
55. A storage medium storing processor implementable instructions for controlling a processor to carry out a method of adding a new word and sub-word representation of the new word to a word dictionary of a speech recognition system, the process instructions comprising:
instructions for receiving a first sequence of sub-word units representative of a first spoken rendition of the new word and for receiving a second sequence of sub-word units representative of a second spoken rendition of the new word;
instructions for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units;
instructions for determining a sequence of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned pairs of sub-word units determined by said aligning step; and
instructions for adding the new word and the representative sequence of sub-word units to said word dictionary.
-
-
56. A storage medium storing processor implementable instructions for controlling a processor to carry out a speech recognition method, the process instructions comprising:
instructions for receiving speech signals to be recognised;
instructions for storing sub-word unit models;
instructions for comparing received speech signals with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals;
instructions for storing a word dictionary relating sequences of sub-word units to words;
instructions for processing the one or more sequences of sub-word units output by said comparing step using the stored word dictionary to generate one or more words corresponding to the received speech signals;
instructions for adding a new word and a sub-word representation of the new word to the word dictionary; and
instructions for controllably feeding the output of said comparing step to either said processing step or said adding step;
characterised in that said adding instructions comprise;
instructions for receiving a first sequence of sub-word units representative of a first spoken rendition of the new word output by said comparing step and for receiving a second sequence of sub-word units representative of a second spoken rendition of the new word output by said comparing step;
instructions for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units;
instructions for determining sequences of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned pairs of sub-word units determined by said aligning step;
instructions for receiving a text rendition of the new word; and
instructions for adding said text rendition of the new word and the representative sequence of sub-word units to said word dictionary.
-
-
57. A storage medium storing processor implementable instructions for controlling a processor to carry out a speech recognition method, the process instructions comprising:
instructions for receiving speech signals to be recognised;
instructions for storing sub-word unit models;
instructions for comparing received speech signals with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals;
instructions for storing a command dictionary relating sequences of sub-word units to commands;
instructions for processing the one or more sequences of sub-word units output by said comparing step using the stored command dictionary to generate one or more commands corresponding to the received speech signals;
instructions for adding a new command and a sub-word representation of the new command to the command dictionary; and
instructions for controllably feeding the output of said comparing step to either said processing step or said adding step;
characterised in that said adding instructions comprise;
instructions for receiving a first sequence of sub-word units representative of a first spoken rendition of the new command output by said comparing step and for receiving a second sequence of sub-word units representative of a second spoken rendition of the new command output by said comparing step;
instructions for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units;
instructions for determining sequences of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned pairs of sub-word units determined by said aligning step; and
instructions for adding said sequence of sub-word units to said command dictionary together with the associated new command.
-
-
58. Processor implementable instructions for controlling a processor to carry out a method of generating a sequence of sub-word units representative of a new word to be added to a dictionary of a speech recognition system, the processor instructions comprising:
first receiving instructions for receiving signals representative of first and second spoken renditions of the new word;
instructions for comparing the received first and second spoken renditions with pre-stored sub-word unit models to generate a first sequence of sub-word units representative of said first spoken rendition of the new word and a second sequence of sub-word units representative of said second spoken rendition of the new word;
instructions for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units; and
instructions for determining a sequence of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned pairs of sub-word units determined by said aligning step.
-
-
59. Processor implementable instructions for controlling a processor to carry out a method of generating a sequence of sub-word units representative of a new word to be added to a dictionary of a speech recognition system, the processor instructions comprising:
instructions for receiving signals representative of a plurality of spoken renditions of the new word;
instructions for comparing the received spoken renditions with pre-stored sub-word unit models to generate a corresponding plurality of sequences of sub-word units representative of said plurality of spoken renditions;
instructions for aligning sub-word units of each rendition with sub-word units of other renditions to form a number of aligned groups of sub-word units, each aligned group comprising a sub-word unit from each rendition; and
instructions for determining a sequence of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned groups of sub-word units determined by said aligning step.
-
-
60. Processor implementable instructions for controlling a processor to carry out a method of adding a new word and sub-word representation of the new word to a word dictionary of a speech recognition system, the process instructions comprising:
instructions for receiving a first sequence of sub-word units representative of a first spoken rendition of the new word and for receiving a second sequence of sub-word units representative of a second spoken rendition of the new word;
instructions for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units;
instructions for determining a sequence of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned pairs of sub-word units determined by said aligning step; and
instructions for adding the new word and the representative sequence of sub-word units to said word dictionary.
-
-
61. A storage medium storing processor implementable instructions for controlling a processor to carry out a speech recognition method, the process instructions comprising:
instructions for receiving speech signals to be recognised;
instructions for storing sub-word unit models;
instructions for comparing received speech signals with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals;
instructions for storing a word dictionary relating sequences of sub-word units to words;
instructions for processing the one or more sequences of sub-word units output by said comparing step using the stored word dictionary to generate one or more words corresponding to the received speech signals;
instructions for adding a new word and a sub-word representation of the new word to the word dictionary; and
instructions for controllably feeding the output of said comparing step to either said processing step or said adding step;
characterised in that said adding instructions comprise;
instructions for receiving a first sequence of sub-word units representative of a first spoken rendition of the new word output by said comparing step and for receiving a second sequence of sub-word units representative of a second spoken rendition of the new word output by said comparing step;
instructions for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units;
instructions for determining sequences of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned pairs of sub-word units determined by said aligning step;
instructions for receiving a text rendition of the new word; and
instructions for adding said text rendition of the new word and the representative sequence of sub-word units to said word dictionary.
-
-
62. A storage medium storing processor implementable instructions for controlling a processor to carry out a speech recognition method, the process instructions comprising:
instructions for receiving speech signals to be recognised;
instructions for storing sub-word unit models;
instructions for comparing received speech signals with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals;
instructions for storing a command dictionary relating sequences of sub-word units to commands;
instructions for processing the one or more sequences of sub-word units output by said comparing step using the stored command dictionary to generate one or more commands corresponding to the received speech signals;
instructions for adding a new command and a sub-word representation of the new command to the command dictionary; and
instructions for controllably feeding the output of said comparing step to either said processing step or said adding step;
characterised in that said adding instructions comprise;
instructions for receiving a first sequence of sub-word units representative of a first spoken rendition of the new command output by said comparing step and for receiving a second sequence of sub-word units representative of a second spoken rendition of the new command output by said comparing step;
instructions for aligning sub-word units of the first sequence with sub-word units of the second sequence to form a number of aligned pairs of sub-word units;
instructions for determining sequences of sub-word units representative of the spoken renditions of the new word in dependence upon the aligned pairs of sub-word units determined by said aligning step; and
instructions for adding said sequence of sub-word units to said command dictionary together with the associated new command.
-
Specification