System and methods for acoustic and language modeling for automatic speech recognition with large vocabularies
First Claim
1. A method of analyzing a language for providing speech recognition, the method comprising steps of:
determining a threshold frequency of occurrence, within a corpus, of word forms in a vocabulary V for the language, by using at least one processor;
in response to determining that a subset of the word forms has a frequency of occurrence in the corpus less than the threshold frequency, splitting at least some of the word forms in the subset to generate word form components, at least some of the word form components not being full words;
generating a language component vocabulary VC comprising the word forms in the vocabulary V and the word form components; and
generating and storing information indicating a correspondence between the word forms in the vocabulary V and corresponding word form components.
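The steps of this claim can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the function names, the `+` marker on suffix components, and the naive suffix splitter are hypothetical and are not taken from the patent. The threshold is also passed in as a parameter rather than determined from the corpus.

```python
from collections import Counter


def naive_suffix_split(word, suffixes=("ing", "ed", "s")):
    """Hypothetical splitter: peel off one known suffix, if any.

    The "+suffix" component is deliberately not a full word.
    """
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return [word[: -len(suffix)], "+" + suffix]
    return [word]


def build_component_vocabulary(corpus_tokens, vocabulary, threshold,
                               split_fn=naive_suffix_split):
    """Return (VC, correspondence) for a vocabulary V.

    Word forms occurring fewer than `threshold` times in the corpus are
    split; VC keeps every word form of V and adds the new components.
    """
    counts = Counter(corpus_tokens)
    components = set()
    correspondence = {}  # word form in V -> list of its components
    for word in vocabulary:
        if counts[word] < threshold:
            parts = split_fn(word)
            if len(parts) > 1:  # only record word forms that were split
                components.update(parts)
                correspondence[word] = parts
    vc = set(vocabulary) | components
    return vc, correspondence


corpus = "walk walk walk walking talked".split()
vc, correspondence = build_component_vocabulary(
    corpus, {"walk", "walking", "talked"}, threshold=2)
```

The stored `correspondence` dictionary plays the role of the information linking word forms in V to their components, so a decoder could later reassemble components into full word forms.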
Abstract
A method is disclosed for generating a language component vocabulary VC for a speech recognition system having a language vocabulary V of a plurality of word forms. The method includes: partitioning the language vocabulary V into subsets of word forms based on frequencies of occurrence of the respective word forms; and, in at least one of the subsets, splitting word forms having frequencies less than a threshold to generate word form components. Also disclosed is a method for use in speech recognition that includes: splitting an acoustic vocabulary comprising baseforms into baseform components and storing the baseform components; and performing sound-to-spelling mapping on the baseform components to generate a table mapping baseform components to word parts for use in subsequent decoding of speech. A method for decoding a speech utterance using language model components and acoustic components includes the steps of: generating from the utterance a stack of baseform component paths; concatenating baseform components in a path to generate concatenated baseforms when the concatenated baseform components correspond to a baseform found in an acoustic vocabulary; mapping the concatenated baseforms into words; computing language model (LM) scores associated with the words using a language model; and performing further decoding of the utterance based on those scores.
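The decoding method in the abstract can be sketched for a single path as follows. This is a simplified illustration under stated assumptions, not the patented decoder: baseform components are modeled as tuples of phone symbols, the acoustic vocabulary is a hypothetical dictionary from baseform to word, the greedy longest-match joining strategy is a swapped-in choice, and the bigram table with a fixed penalty for unseen pairs stands in for a real language model. The stack of competing paths and the further decoding steps are omitted.

```python
def concatenate_path(components, baseform_to_word):
    """Join adjacent baseform components into known baseforms.

    Greedy longest match (an assumption): from each position, try the
    longest run of components whose concatenation is a baseform in the
    acoustic vocabulary, then map that baseform to its word.
    """
    words = []
    i = 0
    while i < len(components):
        for j in range(len(components), i, -1):
            joined = tuple(p for comp in components[i:j] for p in comp)
            if joined in baseform_to_word:
                words.append(baseform_to_word[joined])
                i = j
                break
        else:
            words.append(None)  # no baseform starts at this component
            i += 1
    return words


def score_path(components, baseform_to_word, bigram_logprob, unseen=-10.0):
    """Map a component path to words and sum bigram log-probabilities."""
    words = concatenate_path(components, baseform_to_word)
    score = 0.0
    for prev, cur in zip(["<s>"] + words, words):
        score += bigram_logprob.get((prev, cur), unseen)
    return words, score


# Hypothetical toy data: phone symbols and probabilities are illustrative.
acoustic_vocab = {
    ("W", "AO", "K", "IH", "NG"): "walking",
    ("HH", "OW", "M"): "home",
}
path = [("W", "AO", "K"), ("IH", "NG"), ("HH", "OW", "M")]
lm = {("<s>", "walking"): -1.0, ("walking", "home"): -0.5}
words, score = score_path(path, acoustic_vocab, lm)
```

In a fuller decoder, the same scoring would be applied to every path on the stack so that paths can be ranked before decoding continues.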
17 Citations
20 Claims
1. A method of analyzing a language for providing speech recognition, the method comprising steps of:
determining a threshold frequency of occurrence, within a corpus, of word forms in a vocabulary V for the language, by using at least one processor;
in response to determining that a subset of the word forms has a frequency of occurrence in the corpus less than the threshold frequency, splitting at least some of the word forms in the subset to generate word form components, at least some of the word form components not being full words;
generating a language component vocabulary VC comprising the word forms in the vocabulary V and the word form components; and
generating and storing information indicating a correspondence between the word forms in the vocabulary V and corresponding word form components.
Dependent claims: 2, 3, 4, 5
6. A method for providing speech recognition, the method comprising steps of:
determining a threshold frequency of occurrence, within a corpus, of word forms in a language vocabulary V, by using at least one processor;
in response to determining that a subset of the word forms has a frequency of occurrence less than the threshold frequency, splitting at least a portion of the word forms in the subset to generate word form components, at least some of the word form components not being full words;
generating a language component vocabulary VC comprising the word forms in the language vocabulary V and the word form components;
mapping the language vocabulary V into an acoustic vocabulary comprising baseforms;
splitting the acoustic vocabulary into baseform components and storing said baseform components; and
performing sound to spelling mapping on said baseform components so as to generate information indicating a correspondence between baseform components and word parts for use in subsequent decoding of speech.
Dependent claims: 7, 8, 9, 10, 11
12. A system for analyzing a language for providing speech recognition, the system comprising:
at least one processor programmed to:
determine a threshold frequency of occurrence, within a corpus, of word forms in a language vocabulary V;
in response to determining that a subset of the word forms has a frequency of occurrence less than the threshold frequency, split at least some of the word forms in the subset to generate word form components, at least some of the word form components not being full words;
generate a language component vocabulary VC comprising the word forms in the language vocabulary V and the word form components; and
generate information indicating a correspondence between the word forms in the language vocabulary V and corresponding word form components.
Dependent claims: 13, 14, 15
16. A system for providing speech recognition, comprising:
at least one processor programmed to:
determine a threshold frequency of occurrence, within a corpus, of word forms in a language vocabulary V;
in response to determining that a subset of the word forms has a frequency of occurrence less than the threshold frequency, split at least some of the word forms in the subset to generate word form components, at least some of the word form components not being full words;
generate a language component vocabulary VC comprising the word forms in the language vocabulary V and the word form components;
map the language vocabulary V into an acoustic vocabulary comprising baseforms;
split the acoustic vocabulary into baseform components and store said baseform components; and
perform sound to spelling mapping on said baseform components so as to generate information indicating a correspondence between baseform components and word parts for use in subsequent decoding of speech.
Dependent claims: 17, 18, 19, 20
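The baseform-splitting and sound-to-spelling steps recited in claims 6 and 16 can be pictured with a small sketch. It assumes, purely for illustration, that each word already comes with an alignment between its spelling parts and the phones that realize them; the claims do not say how such an alignment is obtained, and the function name, data layout, and phone symbols below are hypothetical.

```python
from collections import defaultdict


def build_component_table(aligned_lexicon):
    """Collect baseform components and map each one to its word parts.

    `aligned_lexicon` (assumed input) maps a word to a list of
    (spelling_part, phones) pairs covering the word left to right.
    Returns the set of baseform components and a table from each
    component to every spelling part it was aligned with.
    """
    baseform_components = set()
    table = defaultdict(set)
    for pairs in aligned_lexicon.values():
        for spelling_part, phones in pairs:
            component = tuple(phones)  # hashable baseform component
            baseform_components.add(component)
            table[component].add(spelling_part)
    return baseform_components, dict(table)


# Hypothetical aligned entries for two word forms.
aligned = {
    "walking": [("walk", ["W", "AO", "K"]), ("+ing", ["IH", "NG"])],
    "talking": [("talk", ["T", "AO", "K"]), ("+ing", ["IH", "NG"])],
}
components, table = build_component_table(aligned)
```

Keeping a set of spelling parts per component allows one sound sequence to correspond to several spellings, which a later decoding stage would need to disambiguate.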
Specification