Method and apparatus for constructing and using syllable-like unit language models

US 7,676,365 B2
Filed: 04/20/2005
Issued: 03/09/2010
Est. Priority Date: 12/26/2000
Status: Expired due to Fees

Abstract

A method and computer-readable medium use syllable-like units (SLUs) to decode a pronunciation into a phonetic description. The syllable-like units are generally larger than a single phoneme but smaller than a word. The present invention provides a means for defining these syllable-like units and for generating a language model based on these syllable-like units that can be used in the decoding process. As SLUs are longer than phonemes, they contain more acoustic contextual clues and better lexical constraints for speech recognition. Thus, the phoneme accuracy produced from SLU recognition is much better than all-phone sequence recognition.

369 Citations

5 Claims

1. A speech recognition system having a language model generated through a process comprising:
- a processor breaking each word in a dictionary into units, wherein breaking each word into units comprises:
  
  breaking each word in the dictionary into initial units by dividing each word into the largest units possible that each include at most one vowel sound;
  
  for each initial unit, setting a frequency for the initial unit by summing the unigram probabilities of the words in which the initial unit was identified;
  
  breaking at least one of the initial units into smaller units by preferring smaller units that occur more frequently in the dictionary over smaller units that occur less frequently and by preferring smaller units that group together sequences of phonetic units that appear in a word and where each sequence of phonetic units comprises phonetic units that the speech recognition system typically fails to recognize individually;
  
  for each word, grouping the smaller units of the word into n-grams;
  
  counting the total number of n-gram occurrences in the dictionary; and
  
  for each n-gram, counting the number of occurrences of the n-gram in the dictionary and dividing this count by the total number of n-gram occurrences to form a language model probability for the n-gram.
  - 2. The speech recognition system of claim 1 wherein breaking each word further comprises updating the frequencies of the smaller units into which the word is broken.
  - 3. The speech recognition system of claim 2 wherein the frequency of a smaller unit is calculated based on language model probabilities for words in which the smaller unit appears.
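
The counting steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the vowel inventory, the toy lexicon, the function names, and the greedy left-to-right reading of "largest units possible that each include at most one vowel sound" are all assumptions made for illustration, and the step that breaks initial units into smaller, more frequent units is omitted entirely.

```python
from collections import Counter, defaultdict

# Assumed vowel inventory and toy lexicon; neither comes from the patent.
VOWELS = {"aa", "ae", "ah", "ao", "eh", "ih", "iy", "ow", "uw"}
LEXICON = {  # word -> (phoneme sequence, unigram probability)
    "cat": (["k", "ae", "t"], 0.5),
    "catalog": (["k", "ae", "t", "ah", "l", "ao", "g"], 0.25),
    "log": (["l", "ao", "g"], 0.25),
}

def initial_units(phonemes):
    """Greedy split (an assumed reading): close a unit just before a second vowel."""
    units, current, has_vowel = [], [], False
    for p in phonemes:
        if p in VOWELS and has_vowel:
            units.append(tuple(current))
            current, has_vowel = [], False
        current.append(p)
        has_vowel = has_vowel or p in VOWELS
    if current:
        units.append(tuple(current))
    return units

def unit_frequencies(lexicon):
    """Unit frequency = sum of unigram probabilities of the words containing it."""
    freqs = defaultdict(float)
    for phonemes, unigram_prob in lexicon.values():
        for unit in set(initial_units(phonemes)):  # count each word once per unit
            freqs[unit] += unigram_prob
    return dict(freqs)

def ngram_model(lexicon, n=2):
    """Probability of an n-gram = its count / total n-gram occurrences."""
    counts = Counter()
    for phonemes, _ in lexicon.values():
        units = initial_units(phonemes)
        for i in range(len(units) - n + 1):
            counts[tuple(units[i:i + n])] += 1
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}
```

In this toy lexicon, only "catalog" yields more than one unit, so the bigram model contains two entries that share the probability mass equally. A real dictionary would also need word-boundary handling, which the claim text does not specify and this sketch leaves out.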

4. A method comprising:
- for each word in a dictionary of words, dividing the word into units to produce a first set of units for each word, wherein for at least one word, dividing the word into units comprises dividing the word into units smaller than the word and wherein dividing the word into units comprises dividing the word into the largest units possible that each include at most one vowel sound;
  
  a processor setting a frequency of each unit by summing unigram probabilities of the words in which the unit appears, wherein each unigram probability comprises the probability of the word appearing in a corpus of text;
  
  a processor applying a constraint to the units to identify at least one unit, wherein the constraint requires that each unit have fewer than a selected number of phonemes and wherein the unit is identified because it has at least the selected number of phonemes;
  
  for at least one word in the dictionary of words, dividing the word into units to form a second set of units by dividing into smaller units at least one unit of the first set of units for the word such that none of the words in the dictionary is divided into units that include the identified unit;
  
  transforming the units of each word into a set of n-grams for each word; and
  
  forming a language model by determining frequency counts for each n-gram in the sets of n-grams for the words in the dictionary of words.
  - 5. The method of claim 4 wherein the constraint comprises requiring that the unit consist of fewer than five phonemes.
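
The length constraint of claims 4 and 5 can be sketched in the same spirit. The claims do not say how an over-long unit is to be divided, so the halving rule in `split_unit` below is purely a hypothetical stand-in (it ignores vowel boundaries), and all names and the sample data are illustrative assumptions.

```python
from collections import Counter

MAX_PHONEMES = 5  # claim 5: each unit must consist of fewer than five phonemes

def oversized(word_units, limit=MAX_PHONEMES):
    """Identify units that violate the constraint (at least `limit` phonemes)."""
    return {u for units in word_units.values() for u in units if len(u) >= limit}

def split_unit(unit, limit=MAX_PHONEMES):
    """Hypothetical splitter: halve until every piece is under the limit.

    Assumes limit >= 2; the patent claims do not prescribe this rule.
    """
    if len(unit) < limit:
        return [unit]
    mid = len(unit) // 2
    return split_unit(unit[:mid], limit) + split_unit(unit[mid:], limit)

def second_set(word_units, limit=MAX_PHONEMES):
    """Re-divide every word so that no identified (over-long) unit remains."""
    return {
        word: [piece for unit in units for piece in split_unit(unit, limit)]
        for word, units in word_units.items()
    }

def ngram_counts(word_units, n=2):
    """Frequency counts for each n-gram of units, taken word by word."""
    counts = Counter()
    for units in word_units.values():
        for i in range(len(units) - n + 1):
            counts[tuple(units[i:i + n])] += 1
    return counts

# Toy first set of units (assumed): one word with an eight-phoneme unit.
FIRST_SET = {
    "strengths": [("s", "t", "r", "eh", "ng", "k", "th", "s")],
    "cat": [("k", "ae", "t")],
}
```

Calling `second_set(FIRST_SET)` replaces the eight-phoneme unit with two four-phoneme pieces, after which `oversized` returns an empty set and `ngram_counts` can be applied to the result.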

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Weiss, Rebecca C.; Hwang, Mei-Yuh; Alleva, Fileno A.
Primary Examiner(s): Armstrong, Angela A
Application Number: US11/110,602
Publication Number: US 20050187769A1
Time in Patent Office: 1,784 Days
Field of Search: 704/10, 704/243, 704/240
US Class Current: 704/240
CPC Class Codes:
G10L 15/063 Training
G10L 2015/0636 Threshold criteria for the ...
