Position-dependent phonetic models for reliable pronunciation identification
First Claim
1. A method comprising:
- receiving a representation of a speech signal;
- a processor decoding the representation of the speech signal into a sequence of words using a word language model and an acoustic model;
- a processor converting each word in the sequence of words into a sequence of position-dependent phonetic tokens using a phonetic token lexicon, which provides a position-dependent phonetic token description of each word in a lexicon, wherein each position-dependent phonetic token comprises a phone and a position indicator that indicates the position of the phone within a syllable; and
- a processor determining probabilities for sub-sequences in the sequences of position-dependent phonetic tokens by applying sub-sequences of position-dependent phonetic tokens converted from the sequence of words to a position-dependent phonetic language model that describes probabilities of sequences of position-dependent phonetic tokens comprising a conditional probability of a position-dependent phonetic token given at least two preceding position-dependent phonetic tokens and probabilities of individual position-dependent phonetic tokens.
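The converting and determining steps of claim 1 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the toy lexicon entries, the position labels `o` (onset), `n` (nucleus) and `c` (coda), the `phone:position` token spelling, and the unsmoothed counting model with a crude unigram fallback. The claim only requires a phone plus a within-syllable position indicator, a conditional probability given at least two preceding tokens, and probabilities of individual tokens.

```python
from collections import Counter

# Hypothetical phonetic token lexicon: word -> list of (phone, position).
# Labels "o"/"n"/"c" (onset/nucleus/coda) are illustrative assumptions.
LEXICON = {
    "bank": [("b", "o"), ("ae", "n"), ("ng", "c"), ("k", "c")],
    "on": [("aa", "n"), ("n", "c")],
}


def words_to_tokens(words, lexicon=LEXICON):
    """Convert a decoded word sequence into position-dependent phonetic tokens."""
    tokens = []
    for word in words:
        tokens.extend(f"{phone}:{pos}" for phone, pos in lexicon[word])
    return tokens


class TrigramTokenModel:
    """Unsmoothed count-based model over position-dependent tokens (a sketch)."""

    def __init__(self, training_sequences):
        self.unigrams = Counter()
        self.contexts = Counter()   # (t1, t2) pairs that have a successor
        self.trigrams = Counter()   # (t1, t2, t3)
        for seq in training_sequences:
            self.unigrams.update(seq)
            for a, b, c in zip(seq, seq[1:], seq[2:]):
                self.contexts[(a, b)] += 1
                self.trigrams[(a, b, c)] += 1
        self.total = sum(self.unigrams.values())

    def unigram_prob(self, token):
        """Probability of an individual token."""
        return self.unigrams[token] / self.total

    def trigram_prob(self, token, prev2, prev1):
        """P(token | prev2, prev1); falls back to the unigram if context unseen."""
        seen = self.contexts[(prev2, prev1)]
        if seen == 0:
            return self.unigram_prob(token)  # crude fallback, not a true backoff
        return self.trigrams[(prev2, prev1, token)] / seen


def score_subsequences(tokens, model):
    """Return ((t1, t2, t3), probability) for every three-token sub-sequence."""
    return [
        ((a, b, c), model.trigram_prob(c, a, b))
        for a, b, c in zip(tokens, tokens[1:], tokens[2:])
    ]
```

For example, `score_subsequences(words_to_tokens(["bank", "on"]), model)` pairs each three-token window with its conditional probability under `model`. A practical system would use a smoothed n-gram model trained on a large token corpus rather than raw counts.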
Abstract
A representation of a speech signal is received and is decoded to identify a sequence of position-dependent phonetic tokens wherein each token comprises a phone and a position indicator that indicates the position of the phone within a syllable.
5 Claims
1. A method comprising:
- receiving a representation of a speech signal;
- a processor decoding the representation of the speech signal into a sequence of words using a word language model and an acoustic model;
- a processor converting each word in the sequence of words into a sequence of position-dependent phonetic tokens using a phonetic token lexicon, which provides a position-dependent phonetic token description of each word in a lexicon, wherein each position-dependent phonetic token comprises a phone and a position indicator that indicates the position of the phone within a syllable; and
- a processor determining probabilities for sub-sequences in the sequences of position-dependent phonetic tokens by applying sub-sequences of position-dependent phonetic tokens converted from the sequence of words to a position-dependent phonetic language model that describes probabilities of sequences of position-dependent phonetic tokens comprising a conditional probability of a position-dependent phonetic token given at least two preceding position-dependent phonetic tokens and probabilities of individual position-dependent phonetic tokens.
Dependent claims: 2, 3
4. A hardware computer storage medium encoded with a computer program, causing the computer to execute steps comprising:
- decoding a representation of a speech signal using a position-dependent phonetic language model that provides probabilities of sequences of position-dependent phonetic tokens wherein each position-dependent phonetic token comprises a phone and a position identifier that identifies a position within a syllable, wherein decoding produces at least one sequence of position-dependent phonetic tokens;
- receiving a symbol represented by a portion of the speech signal, wherein the symbol is not part of the language of a lexicon; and
- annotating a lexicon by storing a sequence of position-dependent phonetic tokens as a pronunciation for the symbol that is not part of the language of the lexicon, wherein annotating the lexicon further comprises identifying syllable boundaries from the position-dependent phonetic tokens and placing syllable boundaries in the lexicon.
Dependent claims: 5
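The annotating step of claim 4, which identifies syllable boundaries from the position-dependent tokens, might be sketched as follows. This is only an illustration under stated assumptions: the `phone:position` token spelling, the labels `o`/`n`/`c` (onset, nucleus, coda), the boundary rule (a new syllable begins at an onset or nucleus token once the current syllable already has a nucleus), the `.` boundary marker, and the helper names are all hypothetical and not taken from the patent.

```python
def split_syllables(tokens):
    """Group phones into syllables using only the position labels.

    Assumed rule: once the current syllable has a nucleus ("n"), the next
    onset ("o") or nucleus token opens a new syllable.
    """
    syllables = []
    current = []
    has_nucleus = False
    for token in tokens:
        phone, pos = token.rsplit(":", 1)
        if current and has_nucleus and pos in ("o", "n"):
            syllables.append(current)
            current = []
            has_nucleus = False
        current.append(phone)
        if pos == "n":
            has_nucleus = True
    if current:
        syllables.append(current)
    return syllables


def annotate_lexicon(lexicon, symbol, tokens):
    """Store a pronunciation for `symbol`, with '.' marking syllable boundaries."""
    syllables = split_syllables(tokens)
    lexicon[symbol] = " . ".join(" ".join(syl) for syl in syllables)
    return lexicon


# Usage: a decoded token sequence for a symbol missing from the lexicon.
decoded = ["b:o", "ae:n", "ng:c", "k:c", "aa:n", "n:c"]
lexicon = annotate_lexicon({}, "bankon", decoded)
print(lexicon["bankon"])  # b ae ng k . aa n
```

Because each token carries its own position label, the boundaries fall out of a single left-to-right pass, with no separate syllabification model.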
Specification