Method and apparatus for providing unsupervised adaptation of transcriptions
Abstract
An adaptive speech recognition system is provided including an input for receiving a signal derived from a spoken utterance indicative of a certain vocabulary item, a speech recognition dictionary, a speech recognition unit and an adaptation module. The speech recognition dictionary has a plurality of vocabulary items each being associated to a respective dictionary transcription group. The speech recognition unit is in an operative relationship with the speech recognition dictionary and selects a certain vocabulary item from the speech recognition dictionary as being a likely match to the signal received at the input. The results of the speech recognition process are provided to the adaptation module. The adaptation module includes a transcriptions bank having a plurality of orthographic groups, each including a plurality of transcriptions associated with a common vocabulary item. A transcription selector module in the adaptation module retrieves a given orthographic group from the transcriptions bank on a basis of the vocabulary item recognized by the speech recognition unit. The transcription selector module processes the given orthographic group on the basis of the signal received at the input to select a certain transcription from the transcriptions bank. The adaptation module then modifies a dictionary transcription group corresponding to the vocabulary item selected as being a likely match to the signal received at the input on the basis of the selected certain transcription.
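The adaptation flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `adapt_dictionary`, the use of plain Python dictionaries for the transcriptions bank and the speech recognition dictionary, the string format of a transcription, and the `score` callable standing in for the signal-based matching are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Sequence

# Assumed representation: a transcription is a space-separated phoneme string.
Transcription = str


def adapt_dictionary(
    signal: Sequence[float],
    recognized_item: str,
    transcriptions_bank: Dict[str, List[Transcription]],
    dictionary: Dict[str, List[Transcription]],
    score: Callable[[Sequence[float], Transcription], float],
) -> Transcription:
    """Select the best-scoring transcription for the recognized item
    and enter it in the matching dictionary transcription group."""
    # Retrieve the orthographic group for the vocabulary item that the
    # speech recognition unit selected.
    group = transcriptions_bank[recognized_item]

    # Process the group against the signal; `score` is a hypothetical
    # stand-in for whatever acoustic matching the system uses.
    best = max(group, key=lambda t: score(signal, t))

    # Modify the dictionary transcription group for the same item,
    # skipping the insertion if the transcription is already present.
    dict_group = dictionary.setdefault(recognized_item, [])
    if best not in dict_group:
        dict_group.append(best)
    return best
```

Passing the scorer in as a parameter keeps the sketch independent of any particular acoustic model; a real system would presumably score candidates against the speech signal itself.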
23 Claims
1. A method for adapting the transcription content of a speech recognition dictionary, the speech recognition dictionary including a plurality of dictionary transcription groups, each dictionary transcription group being associated to a respective word, said method comprising:
a) providing a computer readable storage medium containing a transcriptions bank, said transcriptions bank comprising a plurality of orthographic groups, each orthographic group including a plurality of transcriptions associated with a common word, each orthographic group being associated with a respective label data element allowing to extract the orthographic group on the basis of the label data element, each orthographic group in the transcriptions bank corresponding to a respective dictionary transcription group;
b) receiving a signal derived from a spoken utterance indicative of a certain word;
c) receiving a label element from a speech recognizer processing said signal indicative of the certain word;
d) retrieving from the transcriptions bank an orthographic group corresponding to the certain word on a basis of said label element;
e) processing said signal to select a certain transcription from said orthographic group retrieved in d), said certain transcription being a most likely match to the spoken utterance conveyed by the signal;
f) entering the certain transcription in a given dictionary transcription group, said given dictionary transcription group corresponding to the certain word selected from the speech recognition dictionary as being the most likely match to the spoken utterance conveyed by the signal received at the input.
Dependent claims: 2, 3, 4, 5
processing the orthographic group corresponding to the certain word on the basis of the signal received at the input to select a set of transcriptions;
inserting the set of transcriptions in the given dictionary transcription group.
3. A method as defined in claim 1, said method further comprising scoring the transcriptions in the orthographic group corresponding to the certain word to select the certain transcription.
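As a rough illustration of scoring the transcriptions in an orthographic group, the sketch below ranks each candidate by its edit distance to a phoneme sequence decoded from the utterance. The edit-distance scorer is a swapped-in stand-in chosen only because it is easy to run; the claim does not specify a scoring technique, and the names `edit_distance`, `score_group`, and `select_best` are hypothetical.

```python
from typing import List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def score_group(decoded: Sequence[str], group: List[str]) -> List[Tuple[str, int]]:
    """Pair each transcription with a score; higher means a closer match.

    `decoded` is an assumed phoneme sequence obtained from the signal.
    """
    return [(t, -edit_distance(decoded, t.split())) for t in group]


def select_best(decoded: Sequence[str], group: List[str]) -> str:
    """Return the transcription with the highest score."""
    return max(score_group(decoded, group), key=lambda pair: pair[1])[0]
```

Negating the distance keeps the convention that a larger score means a better match, so the same `max` selection would work with a likelihood-based scorer.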
4. A method as defined in claim 1, further comprising the step of augmenting said transcriptions bank in said computer readable medium on the basis of said signal derived from a spoken utterance indicative of a certain word.
5. A method as defined in claim 4, wherein said augmenting step includes the steps of:
a) generating a transcription on the basis of said signal derived from a spoken utterance indicative of a certain word;
b) adding the transcription generated in a) to the transcriptions bank in the orthographic group corresponding to the certain word.
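The two augmenting steps above might look like the following in code. This is only a sketch: `generate_transcription` is a hypothetical callable standing in for whatever component derives a transcription from the signal, and the bank is assumed to be a plain dictionary keyed by word.

```python
from typing import Callable, Dict, List, Sequence


def augment_bank(
    signal: Sequence[float],
    word: str,
    bank: Dict[str, List[str]],
    generate_transcription: Callable[[Sequence[float]], str],
) -> str:
    """Generate a transcription from the signal and add it to the
    orthographic group of the recognized word."""
    # Step a): derive a new transcription from the spoken utterance.
    new_transcription = generate_transcription(signal)

    # Step b): add it to the orthographic group for this word. The group
    # is assumed to exist already; a missing word raises KeyError.
    group = bank[word]
    if new_transcription not in group:  # assumption: duplicates are skipped
        group.append(new_transcription)
    return new_transcription
```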
6. An adaptive speech recognition system, said system comprising:
an input for receiving a signal derived from a spoken utterance indicative of a certain vocabulary item;
a speech recognition dictionary comprising a plurality of vocabulary items potentially recognizable on a basis of a spoken utterance, each vocabulary item being associated to a respective dictionary transcription group;
a speech recognition unit in an operative relationship with said speech recognition dictionary, said speech recognition unit being operative for selecting on a basis of the signal received at the input a certain vocabulary item from said speech recognition dictionary as a likely match to the spoken utterance conveyed by the signal received at the input;
an adaptation module in operative relationship with said speech recognition dictionary, said adaptation module including:
a) a transcriptions bank, said transcriptions bank comprising a plurality of orthographic groups, each orthographic group including a plurality of transcriptions associated with a common vocabulary item, each orthographic group in the transcriptions bank corresponding to a respective dictionary transcription group;
b) a transcription selector module operative for:
1) retrieving a given orthographic group from the transcriptions bank on a basis of the vocabulary item selected as being a likely match to the signal received at the input;
2) processing the given orthographic group retrieved in 1) on the basis of the signal received at the input to select a certain transcription from the given orthographic group that is a most likely match to the spoken utterance conveyed by the signal at said input;
3) entering in a dictionary transcription group corresponding to the vocabulary item selected as being a likely match to the spoken utterance conveyed by the signal received at the input the certain transcription selected in 2).
Dependent claims: 7, 8, 9, 10, 11
a continuous allophone recognizer unit generating a transcription on the basis of said signal derived from a spoken utterance indicative of a certain vocabulary item;
means for selectively adding the transcription generated by said continuous allophone recognizer to the orthographic group corresponding to the certain vocabulary item in the transcriptions bank.
12. An apparatus suitable for use in adapting the transcription content of a speech recognition dictionary, the speech recognition dictionary including a plurality of dictionary transcription groups, each dictionary transcription group being associated to a respective word, said apparatus comprising:
a computer readable storage medium containing a transcriptions bank, said transcriptions bank comprising a plurality of orthographic groups, each orthographic group including a plurality of transcriptions associated with a common word, each orthographic group being associated with a respective label data element allowing to extract the orthographic group from the transcriptions bank on the basis of the label data element, each orthographic group in the transcriptions bank corresponding to a respective dictionary transcription group;
an input for receiving:
a) a signal derived from a spoken utterance indicative of a certain word;
b) a label element from a speech recognizer processing said signal indicative of the certain word;
a transcription selector module for:
a) retrieving from the transcriptions bank an orthographic group corresponding to the certain word on a basis of said label element;
b) processing said signal to select a certain transcription from said orthographic group retrieved in a), said certain transcription being a most likely match to the spoken utterance conveyed by the signal received at the input;
an output suitable for releasing a signal representative of the certain transcription for insertion in a given dictionary transcription group, said given dictionary transcription group corresponding to the certain word selected from the speech recognition dictionary as being a likely match to the signal received at the input.
Dependent claims: 13, 14, 15, 16
a continuous allophone recognizer unit for generating transcriptions on the basis of said signal derived from the spoken utterance indicative of the certain word;
a means for selectively adding the transcriptions generated by said continuous allophone recognizer to the orthographic group corresponding to the certain word.
17. An apparatus for modifying a transcriptions bank on the basis of a signal representative of audio information, the transcriptions bank comprising a plurality of orthographic groups, each orthographic group including a plurality of transcriptions associated with a common word, at least some of the transcriptions of the plurality of transcriptions being derived on the basis of a text to transcription module, said apparatus comprising:
an input for receiving a signal derived from a spoken utterance, said signal having been processed by a speech recognizer and found to be indicative of a certain word;
a continuous allophone recognizer unit for processing the signal received at the input to derive a transcription corresponding to the signal representative of audio information;
a processing unit for selectively adding the transcription generated by said continuous allophone recognizer unit to the orthographic group corresponding to the certain word on the basis of predetermined selection rules.
Dependent claims: 18
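One way to picture selective addition under predetermined selection rules is as a list of predicates that a candidate transcription must all satisfy before it is added. The claim does not state what the rules are, so the three rules below (no duplicates, a minimum confidence, a cap on group size), the `Candidate` type, and its `confidence` field are hypothetical examples only.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    transcription: str
    confidence: float  # hypothetical recognizer confidence in [0, 1]


# A rule inspects the candidate and the current orthographic group.
Rule = Callable[[Candidate, List[str]], bool]


def not_duplicate(candidate: Candidate, group: List[str]) -> bool:
    return candidate.transcription not in group


def min_confidence(threshold: float) -> Rule:
    return lambda candidate, group: candidate.confidence >= threshold


def max_group_size(limit: int) -> Rule:
    return lambda candidate, group: len(group) < limit


def selectively_add(candidate: Candidate, group: List[str], rules: List[Rule]) -> bool:
    """Add the candidate to the group only if every rule accepts it."""
    if all(rule(candidate, group) for rule in rules):
        group.append(candidate.transcription)
        return True
    return False
```

Keeping the rules as separate predicates means the rule set can be fixed ahead of time and changed without touching the code that performs the addition.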
19. A computer readable storage medium containing a program element to direct a computer to adapt the transcription content of a speech recognition dictionary, the speech recognition dictionary including a plurality of dictionary transcription groups, each dictionary transcription group being associated to a respective word, the computer including:
memory unit including:
a) a transcriptions bank, said transcriptions bank comprising a plurality of orthographic groups, each orthographic group including a plurality of transcriptions associated with a common word, each orthographic group being associated with a respective label data element allowing to extract the orthographic group on the basis of the label data element, each orthographic group in the transcriptions bank corresponding to a respective dictionary transcription group;
a processor in operative relationship with said memory unit, said program element instructing said processor to implement functional blocks for:
a) receiving a signal derived from a spoken utterance indicative of a certain word;
b) receiving a label element from a speech recognizer processing said signal indicative of the certain word;
c) retrieving from the transcriptions bank an orthographic group corresponding to the certain word on a basis of said label element;
d) processing said signal to select a certain transcription from said orthographic group retrieved in c), said certain transcription being a most likely match to the spoken utterance conveyed by the signal;
e) inserting the certain transcription in a given dictionary transcription group, said given dictionary transcription group corresponding to the certain word selected from the speech recognition dictionary as being a likely match to the signal received at the input.
Dependent claims: 20, 21, 22, 23
processing the orthographic group corresponding to the certain word on the basis of the signal derived from a spoken utterance indicative of a certain word to select a set of transcriptions;
inserting the set of transcriptions in the given dictionary transcription group.
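Selecting a set of transcriptions, rather than a single best one, and inserting that set into the dictionary transcription group could be sketched as below. The choice of a fixed set size `n`, the `score` callable, and the function names are assumptions for illustration; the text does not say how the set is chosen.

```python
import heapq
from typing import Callable, List


def select_top_n(group: List[str], score: Callable[[str], float], n: int) -> List[str]:
    """Return the n highest-scoring transcriptions, best first."""
    return heapq.nlargest(n, group, key=score)


def insert_set(dict_group: List[str], selected: List[str]) -> None:
    """Insert each selected transcription into the dictionary
    transcription group, skipping any that are already present."""
    for transcription in selected:
        if transcription not in dict_group:
            dict_group.append(transcription)
```

Here `score` is assumed to already incorporate the received signal, for example by closing over it, so that it takes only the candidate transcription as an argument.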
21. A computer readable medium as defined in claim 19, wherein said program element instructs said processor to score the transcriptions in the orthographic group corresponding to the certain word to select the certain transcription.
22. A computer readable medium as defined in claim 19, wherein said program element instructs said processor to implement a functional block for augmenting said transcriptions bank on the basis of said signal.
23. A computer readable medium as defined in claim 22, wherein said functional block for augmenting said transcriptions bank is operative for:
a) generating a transcription on the basis of said signal derived from a spoken utterance indicative of a certain word;
b) adding the transcription generated in a) to the transcriptions bank in the orthographic group corresponding to the certain word.
Specification