Speech processing system
Abstract
A system is provided for decoding one or more sequences of sub-word units output by a speech recognition system into one or more representative words. The system uses a dynamic programming technique to align the sequence of sub-word units output by the recognition system with a number of dictionary sub-word unit sequences representative of dictionary words to identify the most likely word or words corresponding to the spoken input.
71 Citations
70 Claims
-
1. An apparatus for identifying one or more words corresponding to a sequence of sub-word units output by a recognition system in response to a rendition of the one or more words, the apparatus comprising:
-
first receiving means for receiving the recognised sequence of sub-word units representative of the one or more words to be identified;
second receiving means for receiving a plurality of dictionary sub-word sequences, each representative of one or more known words;
means for comparing sub-word units of the recognised sequence with sub-word units of each dictionary sequence to provide a set of comparison scores;
means for combining the comparison scores obtained by comparing the sub-word units of the recognised sequence with the sub-word units of the same dictionary sequence to provide, for each dictionary word, a measure of the similarity between the recognised sequence and the dictionary sequence; and
means for identifying said one or more words using the similarity measures provided by the combining means;
wherein said comparing means comprises;
means for aligning sub-word units of the recognised sequence with sub-word units of the same dictionary sequence to form, for each dictionary sequence, a number of aligned pairs of sub-word units;
first sub-comparing means for comparing, for each aligned pair, the recognised sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of intermediate comparison scores representative of the similarities between the recognised sequence sub-word unit and the respective sub-word units of the set;
second sub-comparing means for comparing, for each aligned pair, the dictionary sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set to generate a further corresponding plurality of intermediate comparison scores representative of the similarities between said dictionary sequence sub-word unit and the respective sub-word units of the set; and
means for calculating, for each aligned pair, a comparison score representative of the similarity between the sub-word units of the aligned pair by combining the pluralities of intermediate comparison scores generated by said first and second sub-comparing means, to provide said set of comparison scores.
-
-
8. An apparatus according to claim 7, wherein the confusion probabilities for the recognised sequence sub-word units and the dictionary sequence sub-word units are determined in advance and depend upon the recognition system that was used to generate the respective sub-word unit sequences.
-
9. An apparatus according to claim 5, wherein said intermediate comparison scores represent log probabilities and wherein said calculating means is operable to multiply said probabilities by adding the respective intermediate comparison scores.
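The log-probability arithmetic described in claim 9 can be illustrated with a short sketch. The function names and the numeric values below are hypothetical; the `log_sum` helper is an added assumption (a standard log-sum-exp), included only because log-domain scores usually need to be summed as well as multiplied.

```python
import math


def log_product(log_probs):
    """Multiply probabilities by adding their logarithms."""
    return sum(log_probs)


def log_sum(log_probs):
    """Return log(sum of probabilities) without leaving the log domain."""
    largest = max(log_probs)
    if largest == float("-inf"):
        return largest  # every probability is zero
    return largest + math.log(sum(math.exp(x - largest) for x in log_probs))


# Hypothetical intermediate comparison scores stored as log probabilities.
log_recognised = math.log(0.2)
log_dictionary = math.log(0.5)

# Adding the two logs corresponds to multiplying 0.2 by 0.5.
combined = log_product([log_recognised, log_dictionary])
```

Working in the log domain avoids numerical underflow when many small probabilities are multiplied along a long alignment.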
-
10. An apparatus according to claim 1, wherein each of the sub-word units in said dictionary and recognised sequences of sub-word units belong to said set of predetermined sub-word units and wherein said first and second sub-comparing means are operable to provide said comparison scores using predetermined data which relate the sub-word units in said set to each other.
-
11. An apparatus according to claim 10, wherein said predetermined data comprises, for each sub-word unit in the set of sub-word units, a probability for confusing that sub-word unit with each of the other sub-word units in the set of sub-word units.
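One possible way to hold the predetermined data of claim 11 is a nested table of confusion probabilities. The three-unit inventory, the probability values and the function names below are all made up for illustration; a real table would be estimated in advance for the recognition system in use.

```python
# Hypothetical inventory of three sub-word units with invented probabilities.
# CONFUSION[canonical][observed] is read as P(observed | canonical).
CONFUSION = {
    "p": {"p": 0.80, "b": 0.15, "t": 0.05},
    "b": {"p": 0.20, "b": 0.70, "t": 0.10},
    "t": {"p": 0.10, "b": 0.10, "t": 0.80},
}


def confusion_probability(canonical, observed, table=CONFUSION):
    """Look up the probability of decoding `canonical` as `observed`."""
    return table[canonical][observed]


def rows_are_normalised(table, tol=1e-9):
    """Check that each row is a probability distribution over the set."""
    return all(abs(sum(row.values()) - 1.0) <= tol for row in table.values())
```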
-
12. An apparatus according to claim 1, wherein said aligning means comprises dynamic programming means for aligning said dictionary and recognised sequences of sub-word units using a dynamic programming technique.
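The dynamic programming alignment of claim 12 could be sketched as below. This is a minimal global alignment with fixed match, mismatch and gap scores, which are assumptions for illustration only, since the claims do not fix the costs. The function name `align` and the use of `None` to mark an unpaired unit are hypothetical.

```python
def align(recognised, dictionary, match=1.0, mismatch=-1.0, gap=-1.0):
    """Globally align two sub-word sequences; return (aligned pairs, score)."""
    n, m = len(recognised), len(dictionary)

    def sub(i, j):
        # Score for pairing recognised[i-1] with dictionary[j-1].
        return match if recognised[i - 1] == dictionary[j - 1] else mismatch

    # score[i][j] is the best score for the first i and first j units.
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair the two units
                score[i - 1][j] + gap,            # recognised unit unpaired
                score[i][j - 1] + gap,            # dictionary unit unpaired
            )

    # Trace back from the end to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            pairs.append((recognised[i - 1], dictionary[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap):
            pairs.append((recognised[i - 1], None))
            i -= 1
        else:
            pairs.append((None, dictionary[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs, score[n][m]


# Hypothetical phoneme sequences: recognised "cat" against dictionary "cats".
pairs, total = align(["k", "ae", "t"], ["k", "ae", "t", "s"])
```

In a fuller implementation the fixed scores would presumably be replaced by per-pair comparison scores derived from confusion probabilities.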
-
13. An apparatus according to claim 1, wherein each of said sub-word units represents a phoneme.
-
14. An apparatus according to claim 1, wherein the calculating means is operable to combine the intermediate comparison scores obtained by the first and second sub-comparing means when comparing the recognised sequence sub-word unit and the dictionary sequence sub-word unit in an aligned pair with the same sub-word unit from the set of predetermined sub-word units, to generate a plurality of combined intermediate comparison scores and is operable to generate said comparison score for the aligned pair from the plurality of combined intermediate comparison scores generated for the aligned pair.
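The combination described in claim 14 might look like the following sketch. It assumes that the intermediate comparison scores are confusion probabilities and that each combined score is weighted by the occurrence probability of the set unit before summing; the source does not reproduce the formula, so this weighting, the two-unit inventory and all names here are assumptions.

```python
def pair_score(recognised_unit, dictionary_unit, confusion, prior):
    """Score one aligned pair by summing over every unit in the set.

    confusion[p][x] is taken as P(x | p); prior[p] as the occurrence
    probability of p. Both tables are hypothetical inputs.
    """
    total = 0.0
    for unit in prior:
        first = confusion[unit][recognised_unit]    # first sub-comparison
        second = confusion[unit][dictionary_unit]   # second sub-comparison
        total += first * second * prior[unit]       # combined, then weighted
    return total


# Invented two-unit inventory for illustration.
confusion = {
    "p": {"p": 0.9, "b": 0.1},
    "b": {"p": 0.2, "b": 0.8},
}
prior = {"p": 0.5, "b": 0.5}

same = pair_score("p", "p", confusion, prior)
different = pair_score("p", "b", confusion, prior)
```

With these invented numbers a pair of identical units scores higher than a pair of different units, which is the behaviour a similarity measure should show.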
-
15. An apparatus according to claim 1, wherein said comparing means has a plurality of different comparison modes of operation and wherein the apparatus further comprises:
-
means for determining if the current dictionary sequence of sub-word units was generated from an audio input or a typed input and for outputting a determination result; and
means for selecting, for the current dictionary sub-word sequence, the mode of operation of said comparing means in dependence upon said determination result.
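The mode selection of claim 15 can be sketched as a simple dispatch on how the dictionary entry was created. The string labels "audio" and "typed", the function name `select_scorer` and the two placeholder scorers are hypothetical; the actual comparison formulas are not reproduced in the source and are not implied here.

```python
from typing import Callable

# A scorer takes (recognised_unit, dictionary_unit) and returns a score.
Scorer = Callable[[str, str], float]


def select_scorer(source: str, audio_scorer: Scorer, typed_scorer: Scorer) -> Scorer:
    """Pick the comparison mode from the determination result."""
    if source == "audio":
        return audio_scorer
    if source == "typed":
        return typed_scorer
    raise ValueError(f"unknown dictionary source: {source!r}")


# Placeholder scorers standing in for the two comparison modes.
def audio_mode(recognised_unit: str, dictionary_unit: str) -> float:
    return 0.5  # stand-in value only


def typed_mode(recognised_unit: str, dictionary_unit: str) -> float:
    return 1.0 if recognised_unit == dictionary_unit else 0.0  # stand-in only


chosen = select_scorer("typed", audio_mode, typed_mode)
```

Passing the scorers in as arguments keeps the selection logic separate from whichever formulas the two modes actually use.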
-
-
16. An apparatus according to claim 15, wherein said comparing means includes third sub-comparing means operable to compare, for each aligned pair, the recognised sequence sub-word unit in the aligned pair with the dictionary sequence sub-word unit in the aligned pair by calculating:
-
17. An apparatus according to claim 16, wherein the selecting means is operable to select said first comparison mode of said comparing means when said determining means determines that the current dictionary sequence of sub-word units was generated from an audio input and to select said second comparison mode of said comparing means when said determining means determines that the current dictionary sequence of sub-word units was generated from a typed input.
-
18. An apparatus for identifying one or more words corresponding to a sequence of sub-word units output by a recognition system in response to a rendition of the one or more words, the apparatus comprising:
-
first receiving means for receiving the recognised sequence of sub-word units representative of the one or more words to be identified;
second receiving means for receiving a plurality of dictionary sub-word sequences, each representative of one or more known words;
means for comparing sub-word units of the recognised sequence with sub-word units of each dictionary sequence to provide a set of comparison results;
means for combining the comparison results obtained by comparing the sub-word units of the recognised sequence with the sub-word units of the same dictionary sequence to provide, for each dictionary sequence, a measure of the similarity between the recognised sequence and the dictionary sequence; and
means for identifying said one or more words using the similarity measures provided by the combining means;
wherein said comparing means has a plurality of different comparison modes of operation and wherein the apparatus further comprises;
means for determining if the current dictionary sequence of sub-word units was generated from an audio input or a typed input and for outputting a determination result; and
means for selecting, for the current dictionary sub-word sequence, the mode of operation of said comparing means in dependence upon said determination result.
-
-
21. A speech recognition system comprising:
-
means for receiving speech signals to be recognised;
means for storing sub-word unit models;
means for comparing the received speech signals with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals;
a word dictionary relating sequences of sub-word units to words; and
a word decoder for processing the one or more sequences of sub-word units output by said comparing means using the word dictionary to generate one or more words corresponding to the received speech signals;
wherein said word decoder comprises;
first receiving means for receiving the recognised sequence of sub-word units representative of the received speech signals;
second receiving means for receiving from said word dictionary a plurality of dictionary sub-word sequences, each representative of one or more known words;
means for comparing sub-word units of the recognised sequence with sub-word units of each received dictionary sequence to provide a set of comparison scores;
means for combining the comparison scores obtained by comparing the sub-word units of the recognised sequence with the sub-word units of each received dictionary sequence to provide, for each dictionary sequence, a measure of the similarity between the recognised sequence and each received dictionary sequence; and
means for identifying said one or more words using the similarity measures provided by the combining means;
wherein said comparing means comprises;
means for aligning sub-word units of the recognised sequence with sub-word units of each received dictionary sequence to form a number of aligned pairs of sub-word units;
first sub-comparing means for comparing, for each aligned pair, the recognised sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of intermediate comparison scores representative of the similarities between the recognised sequence sub-word unit and the respective sub-word units of the set;
second sub-comparing means for comparing, for each aligned pair, the dictionary sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set to generate a further corresponding plurality of intermediate comparison scores representative of the similarities between said dictionary sequence sub-word unit and the respective sub-word units of the set; and
means for calculating, for each aligned pair, a comparison score representative of the similarity between the sub-word units of the aligned pair by combining the pluralities of intermediate comparison scores generated by said first and second sub-comparing means, to provide said set of comparison scores.
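The final identification step of the word decoder in claim 21 can be sketched as a search over the dictionary for the entry with the highest similarity measure. The similarity function used here is a stand-in: Python's `difflib.SequenceMatcher` ratio replaces the confusion-based measure of the claims purely so that the example runs on its own. The dictionary contents and the name `identify_word` are hypothetical.

```python
from difflib import SequenceMatcher


def stand_in_similarity(recognised, dictionary_sequence):
    """Placeholder similarity; NOT the confusion-based measure of the claims."""
    return SequenceMatcher(None, recognised, dictionary_sequence).ratio()


def identify_word(recognised, word_dictionary, similarity=stand_in_similarity):
    """Return (word, score) for the dictionary entry most similar to the input."""
    best_word, best_score = None, float("-inf")
    for word, sequence in word_dictionary.items():
        score = similarity(recognised, sequence)
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


# Invented dictionary relating words to phoneme sequences.
word_dictionary = {
    "cat": ["k", "ae", "t"],
    "cut": ["k", "ah", "t"],
    "dog": ["d", "ao", "g"],
}

word, score = identify_word(["k", "ae", "t"], word_dictionary)
```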
-
-
22. A method of identifying one or more words corresponding to a sequence of sub-word units output by a recognition system in response to a rendition of the one or more words, the method comprising:
-
a first receiving step of receiving the recognised sequence of sub-word units representative of the one or more words to be identified;
a second receiving step of receiving a plurality of dictionary sub-word sequences, each representative of one or more known words;
comparing sub-word units of the recognised sequence with sub-word units of each dictionary sequence to provide a set of comparison scores;
combining the comparison scores obtained by comparing the sub-word units of the recognised sequence with the sub-word units of the same dictionary sequence to provide, for each dictionary sequence, a measure of the similarity between the recognised sequence and the dictionary sequence; and
identifying said one or more words using the similarity measures provided by the combining step;
wherein said comparing step comprises;
aligning sub-word units of the recognised sequence with sub-word units of the same dictionary sequence to form, for each dictionary sequence, a number of aligned pairs of sub-word units;
a first sub-comparing step for comparing, for each aligned pair, the recognised sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of intermediate comparison scores representative of the similarities between the recognised sequence sub-word unit and the respective sub-word units of the set;
a second sub-comparing step of comparing, for each aligned pair, the dictionary sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set to generate a further corresponding plurality of intermediate comparison scores representative of the similarities between said dictionary sequence sub-word unit and the respective sub-word units of the set; and
calculating, for each aligned pair, a comparison score representative of the similarity between the sub-word units of the aligned pair by combining the pluralities of intermediate comparison scores generated by said first and second sub-comparing steps, to provide said set of comparison scores.
-
-
29. A method according to claim 28, wherein the confusion probabilities for the recognised sequence sub-word units and the dictionary sequence sub-word units are determined in advance and depend upon the recognition system that was used to generate the respective sub-word unit sequences.
-
30. A method according to claim 26, wherein said intermediate comparison scores represent log probabilities and wherein said calculating step multiplies said probabilities by adding the respective intermediate comparison scores.
-
31. A method according to claim 22, wherein each of the sub-word units in said dictionary and recognised sequences of sub-word units belong to said set of predetermined sub-word units and wherein said first and second sub-comparing means are operable to provide said comparison scores using predetermined data which relate the sub-word units in said set to each other.
-
32. A method according to claim 31, wherein said predetermined data comprises, for each sub-word unit in the set of sub-word units, a probability for confusing that sub-word unit with each of the other sub-word units in the set of sub-word units.
-
33. A method according to claim 22, wherein said aligning step uses a dynamic programming technique to align said dictionary and recognised sequences of sub-word units.
-
34. A method according to claim 22, wherein each of said sub-word units represents a phoneme.
-
35. A method according to claim 22, wherein the calculating step combines the intermediate comparison scores obtained by the first and second sub-comparing steps when comparing the recognised sequence sub-word unit and the dictionary sequence sub-word unit in an aligned pair with the same sub-word unit from the set of predetermined sub-word units, to generate a plurality of combined intermediate comparison scores and generates said comparison score for the aligned pair from the plurality of combined intermediate comparison scores generated for the aligned pair.
-
36. A method according to claim 22, further comprising the steps of:
-
determining if the current dictionary sequence of sub-word units was generated from an audio input or a typed input and outputting a determination result; and
selecting, for the current dictionary sub-word sequence, a comparison technique employed in said comparing step in dependence upon said determination result.
-
-
37. A method according to claim 36, wherein in a first comparison technique, said comparing step comprises said first and second comparing steps and said calculating step and in a second comparison technique comprises a third sub-comparing step of comparing, for each aligned pair, the recognised sequence sub-word unit in the aligned pair with the dictionary sequence sub-word unit in the aligned pair by calculating:
-
38. A method according to claim 37, wherein the selecting step selects said first comparison technique when said determining step determines that the current dictionary sequence of sub-word units was generated from an audio input and selects said second comparison technique when said determining step determines that the current dictionary sequence of sub-word units was generated from a typed input.
-
49. An apparatus according to claim 38, wherein said first and second sub-comparators are operable to provide intermediate comparison scores which are indicative of a probability of confusing the corresponding sub-word unit taken from said set of predetermined sub-word units as the sub-word unit in the aligned pair.
-
50. An apparatus according to claim 49, wherein said calculator is operable to combine the intermediate comparison scores in order to multiply the probabilities of confusing the corresponding sub-word unit taken from the set as the sub-word units in the aligned pair.
-
51. An apparatus according to claim 50, wherein each of said sub-word units in said set of predetermined sub-word units has a predetermined probability of occurring within a sequence of sub-word units and wherein said calculator is operable to weight each of the combined comparison scores in dependence upon the respective probability of occurrence for the sub-word unit of the set used to generate the combined comparison score.
-
52. An apparatus according to claim 51, wherein said calculator is operable to combine said intermediate comparison scores by calculating:
-
53. An apparatus according to claim 52, wherein the confusion probabilities for the recognised sequence sub-word units and the dictionary sequence sub-word units are determined in advance and depend upon the recognition system that was used to generate the respective sub-word unit sequences.
-
54. An apparatus according to claim 50, wherein said intermediate comparison scores represent log probabilities and wherein said calculator is operable to multiply said probabilities by adding the respective intermediate comparison scores.
-
39. A method of identifying one or more words corresponding to a sequence of sub-word units output by a recognition system in response to a rendition of the one or more words, the method comprising:
-
a first receiving step of receiving the recognised sequence of sub-word units representative of the one or more words to be identified;
a second receiving step of receiving a plurality of dictionary sub-word sequences, each representative of one or more known words;
comparing, using a sub-word comparator, sub-word units of the recognised sequence with sub-word units of each dictionary sequence to provide a set of comparison results;
combining the comparison results obtained by comparing the sub-word units of the recognised sequence with the sub-word units of the same dictionary sequence to provide, for each dictionary sequence, a measure of the similarity between the recognised sequence and the dictionary sequence; and
identifying said one or more words using the similarity measures provided by the combining step;
wherein said sub-word comparator has a plurality of different comparison modes of operation and wherein the method further comprises;
determining if the current dictionary sequence of sub-word units was generated from an audio input or a typed input and outputting a determination result; and
selecting, for the current dictionary sub-word sequence, the mode of operation of said sub-word unit comparator in dependence upon said determination result.
-
-
42. A speech recognition method comprising:
-
receiving speech signals to be recognised;
comparing the received speech signals with stored sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals; and
processing the one or more sequences of sub-word units output by said comparing step using a stored word dictionary to generate one or more words corresponding to the received speech signal, wherein said processing step comprises;
a first receiving step of receiving the recognised sequence of sub-word units representative of the received speech signals;
a second receiving step of receiving from said word dictionary a plurality of dictionary sub-word sequences, each representative of one or more known words;
comparing sub-word units of the recognised sequence with sub-word units of each received dictionary sequence to provide a set of comparison scores;
combining the comparison scores obtained by comparing the sub-word units of the recognised sequence with the sub-word units of each received dictionary sequence to provide, for each dictionary sequence, a measure of the similarity between the recognised sequence and each received dictionary sequence; and
identifying said one or more words using the similarity measures provided by the combining step, wherein said comparing step comprises;
aligning sub-word units of the recognised sequence with sub-word units of each received dictionary sequence to form a number of aligned pairs of sub-word units;
a first sub-comparing step of comparing, for each aligned pair, the recognised sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of intermediate comparison scores representative of the similarities between the recognised sequence sub-word unit and the respective sub-word units of the set;
a second sub-comparing step for comparing, for each aligned pair, the dictionary sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set to generate a further corresponding plurality of intermediate comparison scores representative of the similarities between said dictionary sequence sub-word unit and the respective sub-word units of the set; and
calculating, for each aligned pair, a comparison score representative of the similarity between the sub-word units of the aligned pair by combining the pluralities of intermediate comparison scores generated by said first and second sub-comparing steps, to provide said set of comparison scores.
-
-
43. A storage medium storing processor implementable instructions for controlling a processor to implement a method of identifying one or more words corresponding to a sequence of sub-word units output by a recognition system in response to a rendition of the one or more words, the processor implementable instructions comprising:
-
instructions for a first receiving step of receiving the recognised sequence of sub-word units representative of the one or more words to be identified;
instructions for a second receiving step of receiving a plurality of dictionary sub-word sequences, each representative of one or more known words;
instructions for comparing sub-word units of the recognised sequence with sub-word units of each received dictionary sequence to provide a set of comparison scores;
instructions for combining the comparison scores obtained by comparing the sub-word units of the recognised sequence with the sub-word units of the same dictionary sequence to provide, for each dictionary sequence, a measure of the similarity between the recognised sequence and the dictionary sequence; and
instructions for identifying said one or more words using the similarity measures provided by the combining step, wherein said instructions for said comparing step include;
instructions for aligning sub-word units of the recognised sequence with sub-word units of the same dictionary sequence to form, for each dictionary sequence, a number of aligned pairs of sub-word units;
instructions for a first sub-comparing step of comparing, for each aligned pair, the recognised sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of intermediate comparison scores representative of the similarities between the recognised sequence sub-word unit and the respective sub-word units of the set;
instructions for a second sub-comparing step of comparing, for each aligned pair, the dictionary sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set to generate a further corresponding plurality of intermediate comparison scores representative of the similarities between said dictionary sequence sub-word unit and the respective sub-word units of the set; and
instructions for calculating, for each aligned pair, a comparison score representative of the similarity between the sub-word units of the aligned pair by combining the pluralities of intermediate comparison scores generated by said first and second sub-comparing steps, to provide said set of comparison scores.
-
-
44. Processor implementable instructions for controlling a processor to implement a method of identifying one or more words corresponding to a sequence of sub-word units output by a recognition system in response to a rendition of the one or more words, the processor implementable instructions comprising:
-
instructions for a first receiving step of receiving the recognised sequence of sub-word units representative of the one or more words to be identified;
instructions for a second receiving step of receiving a plurality of dictionary sub-word sequences, each representative of one or more known words;
instructions for comparing sub-word units of the recognised sequence with sub-word units of each received dictionary sequence to provide a set of comparison scores;
instructions for combining the comparison scores obtained by comparing the sub-word units of the recognised sequence with the sub-word units of the same dictionary sequence to provide, for each dictionary sequence, a measure of the similarity between the recognised sequence and the dictionary sequence; and
instructions for identifying said one or more words using the similarity measures provided by the combining step, wherein said instructions for said comparing step include;
instructions for aligning sub-word units of the recognised sequence with sub-word units of the same dictionary sequence to form, for each dictionary sequence, a number of aligned pairs of sub-word units;
instructions for a first sub-comparing step of comparing, for each aligned pair, the recognised sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of intermediate comparison scores representative of the similarities between the recognised sequence sub-word unit and the respective sub-word units of the set;
instructions for a second sub-comparing step of comparing, for each aligned pair, the dictionary sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set to generate a further corresponding plurality of intermediate comparison scores representative of the similarities between said dictionary sequence sub-word unit and the respective sub-word units of the set; and
instructions for calculating, for each aligned pair, a comparison score representative of the similarity between the sub-word units of the aligned pair by combining the pluralities of intermediate comparison scores generated by said first and second sub-comparing steps, to provide said set of comparison scores.
-
-
45. An apparatus for identifying one or more words corresponding to a sequence of sub-word units output by a recognition system in response to a rendition of the one or more words, the apparatus comprising:
-
a first receiver operable to receive the recognised sequence of sub-word units representative of the one or more words to be identified;
a second receiver operable to receive a plurality of dictionary sub-word sequences, each representative of one or more known words;
a comparison score generator operable to compare sub-word units of the recognised sequence with sub-word units of each dictionary sequence to provide a set of comparison scores;
a similarity measure generator operable to combine the comparison scores obtained by comparing the sub-word units of the recognised sequence with the sub-word units of the same dictionary sequence to provide, for each dictionary sequence, a measure of the similarity between the recognised sequence and the dictionary sequence; and
a word identifier operable to identify said one or more words using similarity measures provided by the similarity measure generator, wherein said comparison score generator comprises;
a sub-word unit aligner operable to align sub-word units of the recognised sequence with sub-word units of the same dictionary sequence to form, for each dictionary sequence, a number of aligned pairs of sub-word units;
a first sub-comparator operable to compare, for each aligned pair, the recognised sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of intermediate comparison scores representative of the similarities between the recognised sequence sub-word unit and the respective sub-word units of the set;
a second sub-comparator operable to compare, for each aligned pair, the dictionary sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set to generate a further corresponding plurality of intermediate comparison scores representative of the similarities between said dictionary sequence sub-word unit and the respective sub-word units of the set; and
a calculator operable to calculate, for each aligned pair, a comparison score representative of the similarity between the sub-word units of the aligned pair by combining the pluralities of intermediate comparison scores generated by said first and second sub-comparators, to provide said set of comparison scores. - View Dependent Claims (46, 47, 48, 55, 56, 57, 58, 59, 60, 61)
a determiner operable to determine if the current dictionary sequence of sub-word units was generated from an audio input or a typed input and to output a determination result; and
a selector operable to select for the current dictionary sub-word sequence, the mode of operation of said comparison score generator in dependence upon said determination result.
-
-
56. An apparatus according to claim 55, wherein said comparison score generator includes a third sub-comparator operable to compare, for each aligned pair, the recognised sequence sub-word unit in the aligned pair with the dictionary sequence sub-word unit in the aligned pair by calculating:
-
57. An apparatus according to claim 56, wherein the selector is operable to select said first comparison mode of said comparison score generator when said determiner determines that the current dictionary sequence of sub-word units was generated from an audio input and to select said second comparison mode of said comparison score generator when said determiner determines that the current dictionary sequence of sub-word units was generated from a typed input.
-
58. An apparatus according to claim 45, wherein each of the sub-word units in said dictionary and recognised sequences of sub-word units belongs to said set of predetermined sub-word units and wherein said first and second sub-comparators are operable to provide said comparison scores using predetermined data which relate the sub-word units in said set to each other.
-
59. An apparatus according to claim 58, wherein said predetermined data comprises, for each sub-word unit in the set of sub-word units, a probability for confusing that sub-word unit with each of the other sub-word units in the set of sub-word units.
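One way to read claims 45, 58 and 59 together is sketched below. The sketch assumes that the "predetermined data" is a table of confusion probabilities and that the calculator combines the two pluralities of intermediate scores by multiplying them unit by unit and summing over the set. The three-unit inventory, the probability values, the uniform prior and the names `CONFUSION`, `PRIOR` and `pair_score` are all hypothetical; the claims do not fix any of them.

```python
import math

# Hypothetical inventory standing in for the "set of predetermined sub-word units".
UNITS = ["a", "e", "t"]

# Hypothetical confusion data: CONFUSION[p][x] is the assumed probability
# that unit p is decoded as unit x. Each row sums to 1.
CONFUSION = {
    "a": {"a": 0.80, "e": 0.15, "t": 0.05},
    "e": {"a": 0.20, "e": 0.70, "t": 0.10},
    "t": {"a": 0.05, "e": 0.05, "t": 0.90},
}

# Assumed uniform prior over the set (not stated in the claims).
PRIOR = {p: 1.0 / len(UNITS) for p in UNITS}


def pair_score(recognised_unit, dictionary_unit):
    """Log score for one aligned pair, combined over every unit in the set."""
    total = 0.0
    for p in UNITS:
        first = CONFUSION[p][recognised_unit]    # role of the first sub-comparator
        second = CONFUSION[p][dictionary_unit]   # role of the second sub-comparator
        total += first * second * PRIOR[p]       # role of the calculator
    return math.log(total)
```

Under these assumed numbers a matching pair such as `pair_score("a", "a")` scores higher than a mismatched pair such as `pair_score("a", "t")`, and the score is symmetric in its two arguments.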
-
60. An apparatus according to claim 45, wherein said sub-word unit aligner comprises a dynamic programming aligner operable to align said dictionary and recognised sequences of sub-word units using a dynamic programming technique.
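The dynamic programming aligner of claim 60 could be sketched as a standard global alignment with traceback. This is only an illustration: the claim does not specify the recurrence, so the fixed `gap` penalty, the toy scoring function `toy_score` and the function name `align` are assumptions made for the example.

```python
def toy_score(recognised_unit, dictionary_unit):
    # Hypothetical pair score: 0 for a match, a fixed penalty otherwise.
    return 0.0 if recognised_unit == dictionary_unit else -2.0


def align(recognised, dictionary, score=toy_score, gap=-3.0):
    """Return (best total score, aligned pairs); None marks an unmatched unit."""
    n, m = len(recognised), len(dictionary)
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):            # recognised units with no partner
        best[i][0] = best[i - 1][0] + gap
        back[i][0] = "up"
    for j in range(1, m + 1):            # dictionary units with no partner
        best[0][j] = best[0][j - 1] + gap
        back[0][j] = "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = [
                (best[i - 1][j - 1] + score(recognised[i - 1], dictionary[j - 1]), "diag"),
                (best[i - 1][j] + gap, "up"),
                (best[i][j - 1] + gap, "left"),
            ]
            best[i][j], back[i][j] = max(options, key=lambda o: o[0])
    # Trace back from the final cell to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "diag":
            pairs.append((recognised[i - 1], dictionary[j - 1]))
            i, j = i - 1, j - 1
        elif move == "up":
            pairs.append((recognised[i - 1], None))
            i -= 1
        else:
            pairs.append((None, dictionary[j - 1]))
            j -= 1
    pairs.reverse()
    return best[n][m], pairs
```

For example, aligning the recognised sequence `["k", "t"]` against the dictionary sequence `["k", "a", "t"]` leaves the dictionary unit `"a"` unmatched rather than forcing a mismatched pair, because one gap costs less than a mismatch plus a gap under the assumed penalties.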
-
61. An apparatus according to claim 45, wherein each of said sub-word units represents a phoneme.
-
62. A speech recognition system comprising:
-
a speech signal receiver operable to receive speech signals to be recognised;
a sub-word unit model store operable to store sub-word unit models;
a sub-word unit sequence generator operable to compare the received speech signals with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals;
a word dictionary relating sequences of sub-word units to words; and
a word decoder for processing the one or more sequences of sub-word units generated by said sub-word unit sequence generator using the word dictionary to generate one or more words corresponding to the received speech signals;
wherein said word decoder comprises;
a first receiver operable to receive the recognised sequence of sub-word units representative of the received speech signals;
a second receiver operable to receive from said word dictionary a plurality of dictionary sub-word sequences, each representative of one or more known words;
a comparison score generator operable to compare sub-word units of the recognised sequence with sub-word units of each dictionary sequence to provide a set of comparison scores;
a similarity measure generator operable to combine the comparison scores obtained by comparing the sub-word units of the recognised sequence with the sub-word units of each received dictionary sequence to provide, for each dictionary sequence, a measure of the similarity between the recognised sequence and each received dictionary sequence; and
a word identifier operable to identify said one or more words using the similarity measures provided by the similarity measure generator; and
wherein said comparison score generator comprises;
a sub-word unit aligner operable to align sub-word units of the recognised sequence with sub-word units of each received dictionary sequence to form a number of aligned pairs of sub-word units;
a first sub-comparator operable to compare for each aligned pair, the recognised sequence sub-word unit in the aligned pair with each of a plurality of sub-word units taken from a set of predetermined sub-word units, to generate a corresponding plurality of intermediate comparison scores representative of the similarities between the recognised sequence sub-word unit and the respective sub-word units of the set;
a second sub-comparator operable to compare for each aligned pair, the dictionary sequence sub-word unit in the aligned pair with each of said plurality of sub-word units from the set to generate a further corresponding plurality of intermediate comparison scores representative of the similarities between said dictionary sequence sub-word unit and the respective sub-word units of the set; and
a calculator operable to calculate, for each aligned pair, a comparison score representative of the similarity between the sub-word units of the aligned pair by combining the pluralities of intermediate comparison scores generated by said first and second sub-comparators, to provide said set of comparison scores.
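The roles of the similarity measure generator and the word identifier in claim 62 could be sketched as follows. The sketch assumes that the per-pair comparison scores are log scores, that they are combined by simple addition, and that the word with the highest total is selected. None of this is stated in the claim, and the names `similarity` and `identify` are hypothetical.

```python
def similarity(pair_log_scores):
    """Combine the per-pair comparison scores of one dictionary sequence."""
    return sum(pair_log_scores)


def identify(scores_by_word):
    """Pick the dictionary word whose sequence is most similar to the recognised one.

    scores_by_word maps each candidate word to the list of comparison scores
    obtained for its aligned pairs.
    """
    return max(scores_by_word, key=lambda word: similarity(scores_by_word[word]))


# Hypothetical scores for two candidate words.
candidates = {
    "cat": [-0.1, -0.2, -0.1],
    "cap": [-0.1, -0.2, -2.3],
}
best_word = identify(candidates)
```

With the hypothetical scores above, `"cat"` has the higher total and is the word returned.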
-
-
63. An apparatus for identifying one or more words corresponding to a sequence of sub-word units output by a recognition system in response to a rendition of the one or more words, the apparatus comprising:
-
a first receiver operable to receive the recognised sequence of sub-word units representative of the one or more words to be identified;
a second receiver operable to receive a plurality of dictionary sub-word sequences, each representative of one or more known words;
a comparison result generator operable to compare sub-word units of the recognised sequence with sub-word units of each dictionary sequence to provide a set of comparison results;
a similarity measure generator operable to combine the comparison results obtained by comparing the sub-word units of the recognised sequence with the sub-word units of the same dictionary sequence to provide, for each dictionary sequence, a measure of the similarity between the recognised sequence and the dictionary sequence; and
a word identifier operable to identify said one or more words using the similarity measures provided by the similarity measure generator;
wherein said comparison result generator has a plurality of different comparison modes of operation and wherein the apparatus further comprises;
a determiner operable to determine if a current dictionary sequence of sub-word units was generated from an audio input or a typed input and to output a determination result; and
a selector operable to select, for the current dictionary sub-word sequence, the mode of operation of said comparison result generator in dependence upon said determination result. - View Dependent Claims (64, 65)
-
-
66. A speech recognition system comprising:
-
means for receiving speech signals to be recognised;
means for storing sub-word unit models;
means for comparing the received speech signals with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals;
a word dictionary relating sequences of sub-word units to words; and
a word decoder for processing the one or more sequences of sub-word units output by said comparing means using the word dictionary to generate one or more words corresponding to the received speech signals;
wherein said word decoder comprises;
first receiving means for receiving the recognised sequence of sub-word units representative of the one or more words to be identified;
second receiving means for receiving a plurality of dictionary sub-word sequences, each representative of one or more known words;
means for comparing sub-word units of the recognised sequence with sub-word units of each dictionary sequence to provide a set of comparison results;
means for combining the comparison results obtained by comparing the sub-word units of the recognised sequence with the sub-word units of the same dictionary sequence to provide, for each dictionary sequence, a measure of the similarity between the recognised sequence and the dictionary sequence; and
means for identifying said one or more words using the similarity measures provided by the combining means;
wherein said comparing means has a plurality of different comparison modes of operation and wherein the word decoder further comprises;
means for determining if a current dictionary sequence of sub-word units was generated from an audio input or a typed input and to output a determination result; and
means for selecting, for the current dictionary sub-word sequence, the mode of operation of said comparing means in dependence upon said determination result.
-
-
67. A speech recognition system comprising:
-
a speech signal receiver operable to receive speech signals to be recognised;
a sub-word unit model store operable to store sub-word unit models;
a sub-word unit sequence generator operable to compare the received speech signals with the sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals;
a word dictionary relating sequences of sub-word units to words; and
a word decoder for processing the one or more sequences of sub-word units output by said sub-word unit sequence generator using the word dictionary to generate one or more words corresponding to the received speech signals;
wherein said word decoder comprises;
a first receiver operable to receive the recognised sequence of sub-word units representative of the one or more words to be identified;
a second receiver operable to receive a plurality of dictionary sub-word sequences, each representative of one or more known words;
a comparison result generator operable to compare sub-word units of the recognised sequence with sub-word units of each dictionary sequence to provide a set of comparison results;
a similarity measure generator operable to combine the comparison results obtained by comparing the sub-word units of the recognised sequence with the sub-word units of the same dictionary sequence to provide, for each dictionary sequence, a measure of the similarity between the recognised sequence and the dictionary sequence; and
a word identifier operable to identify said one or more words using the similarity measures provided by the similarity measure generator;
wherein said comparison result generator has a plurality of different comparison modes of operation and wherein the word decoder further comprises;
a determiner operable to determine if a current dictionary sequence of sub-word units was generated from an audio input or a typed input and to output a determination result; and
a selector operable to select, for the current dictionary sub-word sequence, the mode of operation of said comparison result generator in dependence upon said determination result.
-
-
68. A speech recognition method comprising:
-
receiving speech signals to be recognised;
comparing the received speech signals with stored sub-word unit models to generate one or more sequences of sub-word units representative of the received speech signals;
processing the one or more sequences of sub-word units generated by said comparing step using a stored word dictionary to generate one or more words corresponding to the received speech signals;
wherein said processing step comprises;
a first receiving step of receiving the recognised sequence of sub-word units representative of the one or more words to be identified;
a second receiving step of receiving a plurality of dictionary sub-word sequences, each representative of one or more known words;
comparing, using a sub-word comparator, sub-word units of the recognised sequence with sub-word units of each dictionary sequence to provide a set of comparison results;
combining the comparison results obtained by comparing the sub-word units of the recognised sequence with the sub-word units of the same dictionary sequence to provide, for each dictionary sequence, a measure of the similarity between the recognised sequence and the dictionary sequence; and
identifying said one or more words using the similarity measures provided by the combining step;
wherein said sub-word comparator has a plurality of different comparison modes of operation and wherein the processing step further comprises;
determining if a current dictionary sequence of sub-word units was generated from an audio input or a typed input and outputting a determination result; and
selecting, for the current dictionary sub-word sequence, the mode of operation of said sub-word comparator in dependence upon said determination result.
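The determining and selecting steps above could be sketched as below. Following claim 57, the first mode is chosen for dictionary sequences generated from audio and the second for typed ones. What each mode computes is an assumption here: the first mode sums over a whole unit set, and the second compares the two units directly, in the spirit of the third sub-comparator of claim 56. The `source` field, the two-unit table and every name in the block are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical two-unit inventory and confusion data for illustration only.
UNITS = ["a", "e"]
CONFUSION = {"a": {"a": 0.9, "e": 0.1}, "e": {"a": 0.2, "e": 0.8}}
PRIOR = {"a": 0.5, "e": 0.5}


@dataclass
class DictionaryEntry:
    word: str
    units: list
    source: str  # assumed tag: "audio" or "typed"


def score_first_mode(recognised_unit, dictionary_unit):
    # Assumed first mode: both units may be noisy, so combine over the whole set.
    return sum(
        CONFUSION[p][recognised_unit] * CONFUSION[p][dictionary_unit] * PRIOR[p]
        for p in UNITS
    )


def score_second_mode(recognised_unit, dictionary_unit):
    # Assumed second mode: the typed unit is treated as correct, so compare directly.
    return CONFUSION[dictionary_unit][recognised_unit]


def select_mode(entry):
    """Determiner and selector in one step: map the entry's origin to a mode."""
    if entry.source == "audio":
        return score_first_mode
    if entry.source == "typed":
        return score_second_mode
    raise ValueError(f"unknown source: {entry.source!r}")
```

Keeping the origin as a field on each dictionary entry means the mode can change from one dictionary sequence to the next, which is what "for the current dictionary sub-word sequence" appears to require.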
-
-
69. A storage medium storing processor implementable instructions for controlling a processor to implement a method of identifying one or more words corresponding to a sequence of sub-word units output by a recognition system in response to a rendition of the one or more words, the processor implementable instructions comprising:
-
instructions for a first receiving step of receiving the recognised sequence of sub-word units representative of the one or more words to be identified;
instructions for a second receiving step of receiving a plurality of dictionary sub-word sequences, each representative of one or more known words;
instructions for comparing, using a sub-word comparator, sub-word units of the recognised sequence with sub-word units of each dictionary sequence to provide a set of comparison results;
instructions for combining the comparison results obtained by comparing the sub-word units of the recognised sequence with the sub-word units of the same dictionary sequence to provide, for each dictionary sequence, a measure of the similarity between the recognised sequence and the dictionary sequence; and
instructions for identifying said one or more words using the similarity measures provided by the combining step;
wherein said sub-word comparator has a plurality of different comparison modes of operation and wherein said instructions for said comparing step comprise;
instructions for determining if a current dictionary sequence of sub-word units was generated from an audio input or a typed input and outputting a determination result; and
instructions for selecting, for the current dictionary sub-word sequence, the mode of operation of said sub-word comparator in dependence upon said determination result.
-
-
70. Processor implementable instructions for controlling a processor to implement a method of identifying one or more words corresponding to a sequence of sub-word units output by a recognition system in response to a rendition of the one or more words, the processor implementable instructions comprising:
-
instructions for a first receiving step of receiving the recognised sequence of sub-word units representative of the one or more words to be identified;
instructions for a second receiving step of receiving a plurality of dictionary sub-word sequences, each representative of one or more known words;
instructions for comparing, using a sub-word comparator, sub-word units of the recognised sequence with sub-word units of each dictionary sequence to provide a set of comparison results;
instructions for combining the comparison results obtained by comparing the sub-word units of the recognised sequence with the sub-word units of the same dictionary sequence to provide, for each dictionary sequence, a measure of the similarity between the recognised sequence and the dictionary sequence; and
instructions for identifying said one or more words using the similarity measures provided by the combining step;
wherein said sub-word comparator has a plurality of different comparison modes of operation and wherein said instructions for said comparing step comprise;
instructions for determining if a current dictionary sequence of sub-word units was generated from an audio input or a typed input and outputting a determination result; and
instructions for selecting, for the current dictionary sub-word sequence, the mode of operation of said sub-word comparator in dependence upon said determination result.
-
Specification