Method and apparatus for providing unsupervised adaptation of transcriptions

US 6,208,964 B1
Filed: 08/31/1998
Issued: 03/27/2001
Est. Priority Date: 08/31/1998
Status: Expired due to Term

18 Assignments

0 Petitions

Abstract

An adaptive speech recognition system is provided including an input for receiving a signal derived from a spoken utterance indicative of a certain vocabulary item, a speech recognition dictionary, a speech recognition unit and an adaptation module. The speech recognition dictionary has a plurality of vocabulary items each being associated to a respective dictionary transcription group. The speech recognition unit is in an operative relationship with the speech recognition dictionary and selects a certain vocabulary item from the speech recognition dictionary as being a likely match to the signal received at the input. The results of the speech recognition process are provided to the adaptation module. The adaptation module includes a transcriptions bank having a plurality of orthographic groups, each including a plurality of transcriptions associated with a common vocabulary item. A transcription selector module in the adaptation module retrieves a given orthographic group from the transcriptions bank on a basis of the vocabulary item recognized by the speech recognition unit. The transcription selector module processes the given orthographic group on the basis of the signal received at the input to select a certain transcription from the transcriptions bank. The adaptation module then modifies a dictionary transcription group corresponding to the vocabulary item selected as being a likely match to the signal received at the input on the basis of the selected certain transcription.
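The flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `adapt`, `toy_score`, `bank`, and `dictionary` are hypothetical, the "signal" is represented as an already-decoded phoneme sequence, and the scorer is a plain edit distance standing in for whatever acoustic scoring a real recognizer would apply.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Transcription = Tuple[str, ...]  # a sequence of phoneme symbols
Scorer = Callable[[Sequence[str], Transcription], float]


def toy_score(signal: Sequence[str], transcription: Transcription) -> float:
    """Toy stand-in for acoustic scoring: negative Levenshtein distance.

    ASSUMPTION: a real system would score the transcription against the
    audio signal with acoustic models, not against a phoneme sequence.
    """
    prev = list(range(len(transcription) + 1))
    for i, s in enumerate(signal, start=1):
        cur = [i]
        for j, t in enumerate(transcription, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (s != t)))  # substitution or match
        prev = cur
    return -float(prev[-1])


def adapt(label: str,
          signal: Sequence[str],
          bank: Dict[str, List[Transcription]],
          dictionary: Dict[str, List[Transcription]],
          score: Scorer = toy_score) -> Transcription:
    """Select the best-scoring transcription for `label` and enter it."""
    group = bank[label]                        # retrieve the orthographic group
    best = max(group, key=lambda t: score(signal, t))
    entries = dictionary.setdefault(label, [])
    if best not in entries:                    # avoid duplicate dictionary entries
        entries.append(best)
    return best
```

Here `label` plays the role of the vocabulary item returned by the speech recognition unit, and the dictionary entry for that item is modified only after the best transcription has been chosen from the corresponding orthographic group.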

360 Citations

23 Claims

1. A method for adapting the transcription content of a speech recognition dictionary, the speech recognition dictionary including a plurality of dictionary transcription groups, each dictionary transcription group being associated to a respective word, said method comprising:
- a) providing a computer readable storage medium containing a transcriptions bank, said transcriptions bank comprising a plurality of orthographic groups, each orthographic group including a plurality of transcriptions associated with a common word, each orthographic group being associated with a respective label data element allowing to extract the orthographic group on the basis of the label data element, each orthographic group in the transcriptions bank corresponding to a respective dictionary transcription group;
  
  b) receiving a signal derived from a spoken utterance indicative of a certain word;
  
  c) receiving a label element from a speech recognizer processing said signal indicative of the certain word;
  
  d) retrieving from the transcriptions bank an orthographic group corresponding to the certain word on a basis of said label element;
  
  e) processing said signal to select a certain transcription from said orthographic group retrieved in d), said certain transcription being a most likely match to the spoken utterance conveyed by the signal;
  
  f) entering the certain transcription in a given dictionary transcription group, said given dictionary transcription group corresponding to the certain word selected from the speech recognition dictionary as being the most likely match to the spoken utterance conveyed by the signal received at the input.
2. A method as defined in claim 1, further comprising:
3. A method as defined in claim 1, said method further comprising scoring the transcriptions in the orthographic group corresponding to the certain word to select the certain transcription.
4. A method as defined in claim 1, further comprising the step of augmenting said transcriptions bank in said computer readable medium on the basis of said signal derived from a spoken utterance indicative of a certain word.
5. A method as defined in claim 4, wherein said augmenting step includes the steps of:
- a) generating a transcription on the basis of said signal derived from a spoken utterance indicative of a certain word;
  
  b) adding the transcription generated in a) to the transcriptions bank in the orthographic group corresponding to the certain word.
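
The augmenting step of claims 4 and 5 might look like the following sketch. It assumes a hypothetical `recognize` callable that turns the signal into a phoneme sequence; the claims do not say how that transcription is generated, so the callable is left to the caller. The names `augment_bank` and `bank` are likewise illustrative only.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Transcription = Tuple[str, ...]


def augment_bank(label: str,
                 signal: object,
                 bank: Dict[str, List[Transcription]],
                 recognize: Callable[[object], Sequence[str]]) -> bool:
    """Generate a transcription from the signal and add it to the bank.

    ASSUMPTION: `recognize` is a caller-supplied stand-in for the unit
    that derives a transcription from the spoken utterance.
    Returns True if the orthographic group for `label` grew.
    """
    candidate = tuple(recognize(signal))       # step a) generate a transcription
    group = bank.setdefault(label, [])
    if candidate in group:                     # already present, nothing to add
        return False
    group.append(candidate)                    # step b) add it to the group
    return True
```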

6. An adaptive speech recognition system, said system comprising:
- an input for receiving a signal derived from a spoken utterance indicative of a certain vocabulary item;
  
  a speech recognition dictionary comprising a plurality of vocabulary items potentially recognizable on a basis of a spoken utterance, each vocabulary item being associated to a respective dictionary transcription group;
  
  a speech recognition unit in an operative relationship with said speech recognition dictionary, said speech recognition unit being operative for selecting on a basis of the signal received at the input a certain vocabulary item from said speech recognition dictionary as a likely match to the spoken utterance conveyed by the signal received at the input;
  
  an adaptation module in operative relationship with said speech recognition dictionary, said adaptation module including;
  
  a) a transcriptions bank, said transcriptions bank comprising a plurality of orthographic groups, each orthographic group including a plurality of transcriptions associated with a common vocabulary item, each orthographic group in the transcriptions bank corresponding to a respective dictionary transcription group;
  
  b) a transcription selector module operative for;
  
  1) retrieving a given orthographic group from the transcriptions bank on a basis of the vocabulary item selected as being a likely match to the signal received at the input;
  
  2) processing the given orthographic group retrieved in 1) on the basis of the signal received at the input to select a certain transcription from the given orthographic group that is a most likely match to the spoken utterance conveyed by the signal at said input;
  
  3) entering in a dictionary transcription group corresponding to the vocabulary item selected as being a likely match to the spoken utterance conveyed by the signal received at the input the certain transcription selected in 2).
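
The "entering" operation in item 3) is narrowed by dependent claims 7 and 8 to either adding the selected transcription or substituting it for an existing one. A minimal sketch of both variants follows. The function name `enter_transcription` and the `substitute_index` parameter are hypothetical, and the claims do not state which existing transcription is replaced, so the caller chooses it here.

```python
from typing import List, Optional, Tuple

Transcription = Tuple[str, ...]


def enter_transcription(group: List[Transcription],
                        selected: Transcription,
                        substitute_index: Optional[int] = None) -> None:
    """Enter `selected` into a dictionary transcription group in place.

    substitute_index=None  -> add the transcription (claim 7 variant)
    substitute_index=k     -> replace the k-th entry (claim 8 variant)
    ASSUMPTION: the choice of which entry to replace is left to the caller.
    """
    if selected in group:          # nothing to do if it is already present
        return
    if substitute_index is None:
        group.append(selected)
    else:
        group[substitute_index] = selected
```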
7. An adaptive speech recognition system as defined in claim 6, wherein said transcription selector module is operative for adding the certain transcription in the dictionary transcription group corresponding to the vocabulary item selected from the speech recognition dictionary as being a likely match to the spoken utterance conveyed by the signal received at the input.
8. An adaptive speech recognition system as defined in claim 6, wherein said transcription selector module is operative for substituting with the certain transcription a transcription in the dictionary transcription group corresponding to the vocabulary item selected from the speech recognition dictionary as being a likely match to the spoken utterance conveyed by the signal received at the input.
9. An adaptive speech recognition system as defined in claim 6, wherein said transcription selector module is operative for scoring the transcriptions in the given orthographic group to select the certain transcription.
10. An adaptive speech recognition system as defined in claim 9, wherein said adaptation module further comprises an augmentor unit for adding transcriptions derived from the signal received at the input to the transcriptions bank.
11. An adaptive speech recognition system as defined in claim 10, wherein said augmentor unit comprises:

12. An apparatus suitable for use in adapting the transcription content of a speech recognition dictionary, the speech recognition dictionary including a plurality of dictionary transcription groups, each dictionary transcription group being associated to a respective word, said apparatus comprising:
- a computer readable storage medium containing a transcriptions bank, said transcriptions bank comprising a plurality of orthographic groups, each orthographic group including a plurality of transcriptions associated with a common word, each orthographic group being associated with a respective label data element allowing to extract the orthographic group from the transcriptions bank on the basis of the label data element, each orthographic group in the transcriptions bank corresponding to a respective dictionary transcription group;
  
  an input for receiving;
  
  a) a signal derived from a spoken utterance indicative of a certain word;
  
  b) a label element from a speech recognizer processing said signal indicative of the certain word;
  
  a transcription selector module for;
  
  a) retrieving from the transcriptions bank an orthographic group corresponding to the certain word on a basis of said label element;
  
  b) processing said signal to select a certain transcription from said orthographic group retrieved in a), said certain transcription being a most likely match to the spoken utterance conveyed by the signal received at the input;
  
  an output suitable for releasing a signal representative of the certain transcription for insertion in a given dictionary transcription group, said given dictionary transcription group corresponding to the certain word selected from the speech recognition dictionary as being a likely match to the signal received at the input.
13. An apparatus as defined in claim 12, wherein said transcription selector module is operative for processing the orthographic group retrieved in a) on the basis of the signal received at the input to select a set of transcriptions from the given orthographic group, the output being operative for releasing a signal representative of the set of transcriptions for insertion in the given dictionary transcription group.
14. An apparatus as defined in claim 12, wherein said transcription selection module is further operative for scoring the transcriptions in the orthographic group corresponding to the certain word to select the certain transcription.
15. An apparatus as defined in claim 12, further comprising an augmentor unit for modifying the transcriptions bank in said computer readable medium on the basis of the signal derived from the spoken utterance indicative of the certain word.
16. An apparatus as defined in claim 15, wherein said augmentor unit comprises:

17. An apparatus for modifying a transcriptions bank on the basis of a signal representative of audio information, the transcriptions bank comprising a plurality of orthographic groups, each orthographic group including a plurality of transcriptions associated with a common word, at least some of the transcriptions of the plurality of transcriptions being derived on the basis of a text to transcription module, said apparatus comprising:
- an input for receiving a signal derived from a spoken utterance, said signal having been processed by a speech recognizer and found to be indicative of a certain word;
  
  a continuous allophone recognizer unit for processing the signal received at the input to derive a transcription corresponding to the signal representative of audio information;
  
  a processing unit for selectively adding the transcription generated by said continuous allophone recognizer unit to the orthographic group corresponding to the certain word on the basis of predetermined selection rules.
18. An apparatus as defined in claim 17, wherein said transcription is of a type selected from the set consisting of a phonemic transcription and an allophonic transcription.
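
Claim 17 leaves the "predetermined selection rules" unspecified. The sketch below shows one way such rules could be expressed as composable predicates. The two rules shown (`not_duplicate` and `max_group_size`) are invented purely for illustration and are not taken from the patent, and the candidate transcription is assumed to have already been produced by the continuous allophone recognizer.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Transcription = Tuple[str, ...]
Rule = Callable[[Transcription, List[Transcription]], bool]


def not_duplicate(candidate: Transcription, group: List[Transcription]) -> bool:
    """Hypothetical rule: reject transcriptions already in the group."""
    return candidate not in group


def max_group_size(limit: int) -> Rule:
    """Hypothetical rule factory: cap the size of an orthographic group."""
    def rule(candidate: Transcription, group: List[Transcription]) -> bool:
        return len(group) < limit
    return rule


def selectively_add(word: str,
                    candidate: Transcription,
                    bank: Dict[str, List[Transcription]],
                    rules: Sequence[Rule]) -> bool:
    """Add `candidate` to the group for `word` only if every rule passes."""
    group = bank.setdefault(word, [])
    if all(rule(candidate, group) for rule in rules):
        group.append(candidate)
        return True
    return False
```

Keeping each rule as a separate predicate means the acceptance policy can be changed without touching the code that updates the transcriptions bank.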

19. A computer readable storage medium containing a program element to direct a computer to adapt the transcription content of a speech recognition dictionary, the speech recognition dictionary including a plurality of dictionary transcription groups, each dictionary transcription group being associated to a respective word, the computer including:
- memory unit including;
  
  a) a transcriptions bank, said transcriptions bank comprising a plurality of orthographic groups, each orthographic group including a plurality of transcriptions associated with a common word, each orthographic group being associated with a respective label data element allowing to extract the orthographic group on the basis of the label data element, each orthographic group in the transcriptions bank corresponding to a respective dictionary transcription group;
  
  a processor in operative relationship with said memory unit, said program element instructing said processor to implement functional blocks for;
  
  a) receiving a signal derived from a spoken utterance indicative of a certain word;
  
  b) receiving a label element from a speech recognizer processing said signal indicative of the certain word;
  
  c) retrieving from the transcriptions bank an orthographic group corresponding to the certain word on a basis of said label element;
  
  d) processing said signal to select a certain transcription from said orthographic group retrieved in c), said certain transcription being a most likely match to the spoken utterance conveyed by the signal;
  
  e) inserting the certain transcription in a given dictionary transcription group, said given dictionary transcription group corresponding to the certain word selected from the speech recognition dictionary as being a likely match to the signal received at the input.
20. A computer readable medium as defined in claim 19, wherein said program element further instructs said processor to implement functional blocks for;
21. A computer readable medium as defined in claim 19, wherein said program element instructs said processor to score the transcriptions in the orthographic group corresponding to the certain word to select the certain transcription.
22. A computer readable medium as defined in claim 19, wherein said program element instructs said processor to implement a functional block for augmenting said transcriptions bank on the basis of said signal.
23. A computer readable medium as defined in claim 22, wherein said functional block for augmenting said transcriptions bank is operative for:
- a) generating a transcription on the basis of said signal derived from a spoken utterance indicative of a certain word;
  
  b) adding the transcription generated in a) to the transcriptions bank in the orthographic group corresponding to the certain word.

Current Assignee: Avaya Incorporated
Original Assignee: Nortel Networks Limited (Nortel Networks Corporation)
Inventors: Sabourin, Michael
Primary Examiner(s): Šmits, Tālivaldis I.
Application Number: US09/144,065
Time in Patent Office: 939 Days
Field of Search: 704/235, 704/251, 704/254, 704/243, 704/244
US Class Current: 704/244
CPC Class Codes: G10L 15/063 Training; G10L 2015/0631 Creating reference template...
