Position-dependent phonetic models for reliable pronunciation identification

US 8,135,590 B2
Filed: 01/11/2007
Issued: 03/13/2012
Est. Priority Date: 01/11/2007
Status: Expired due to Fees

Abstract

A representation of a speech signal is received and is decoded to identify a sequence of position-dependent phonetic tokens wherein each token comprises a phone and a position indicator that indicates the position of the phone within a syllable.
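The token described in the abstract can be pictured as a (phone, position) pair. The sketch below is only an illustration under stated assumptions: the `PhoneToken` type, the `LEXICON` entries, the phone symbols, and the "nucleus" label are hypothetical and are not taken from the patent, which names only onset consonant and coda consonant in its claims.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PhoneToken:
    """A phone plus an indicator of its position within a syllable."""
    phone: str     # e.g. "k"
    position: str  # "onset", "nucleus", or "coda" (labels are assumptions)


# Hypothetical phonetic token lexicon: one token description per word.
LEXICON: Dict[str, List[PhoneToken]] = {
    "cat": [PhoneToken("k", "onset"),
            PhoneToken("ae", "nucleus"),
            PhoneToken("t", "coda")],
    "nap": [PhoneToken("n", "onset"),
            PhoneToken("ae", "nucleus"),
            PhoneToken("p", "coda")],
}


def words_to_tokens(words: List[str]) -> List[PhoneToken]:
    """Concatenate the lexicon's token descriptions for each word in order."""
    tokens: List[PhoneToken] = []
    for word in words:
        tokens.extend(LEXICON[word])  # raises KeyError for unknown words
    return tokens
```

Note that the same phone in two positions, such as `PhoneToken("t", "onset")` and `PhoneToken("t", "coda")`, compares as two different tokens, which is what makes the representation position-dependent.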

5 Claims

1. A method comprising:
- receiving a representation of a speech signal;
- a processor decoding the representation of the speech signal into a sequence of words using a word language model and an acoustic model;
- a processor converting each word in the sequence of words into a sequence of position-dependent phonetic tokens using a phonetic token lexicon, which provides a position-dependent phonetic token description of each word in a lexicon, wherein each position-dependent phonetic token comprises a phone and a position indicator that indicates the position of the phone within a syllable; and
- a processor determining probabilities for sub-sequences in the sequences of position-dependent phonetic tokens by applying sub-sequences of position-dependent phonetic tokens converted from the sequence of words to a position-dependent phonetic language model that describes probabilities of sequences of position-dependent phonetic tokens comprising a conditional probability of a position-dependent phonetic token given at least two preceding position-dependent phonetic tokens and probabilities of individual position-dependent phonetic tokens.

2. The method of claim 1 further comprising utilizing the at least one probability as a confidence measure.

3. The method of claim 1 wherein at least one position indicator indicates one of a group of syllable positions consisting of onset consonant, and coda consonant.
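The probability step in claim 1 and the confidence measure in claim 2 can be sketched as a count-based trigram model over (phone, position) tokens. This is a minimal illustration, not the patented implementation: the function names, the toy training data, the "nucleus" label, the plain fall-back from trigram to unigram (with no back-off weights or smoothing), and the geometric-mean confidence are all assumptions.

```python
import math
from collections import Counter

Token = tuple  # (phone, position), e.g. ("k", "onset")


def train(sequences):
    """Collect unigram, trigram, and trigram-context counts from token sequences."""
    uni, ctx, tri = Counter(), Counter(), Counter()
    for seq in sequences:
        uni.update(seq)
        for a, b, c in zip(seq, seq[1:], seq[2:]):
            tri[(a, b, c)] += 1
            ctx[(a, b)] += 1  # counted only where a third token follows
    return uni, ctx, tri


def unigram_prob(model, tok):
    """Probability of an individual token."""
    uni = model[0]
    return uni[tok] / sum(uni.values())


def trigram_prob(model, a, b, c):
    """P(c | a, b), or None if the two-token context was never observed."""
    _, ctx, tri = model
    if ctx[(a, b)] == 0:
        return None
    return tri[(a, b, c)] / ctx[(a, b)]


def confidence(model, seq):
    """Geometric-mean per-token probability (an assumed confidence measure)."""
    if not seq:
        return 0.0
    log_p = 0.0
    for i, tok in enumerate(seq):
        p = trigram_prob(model, seq[i - 2], seq[i - 1], tok) if i >= 2 else None
        if not p:  # unseen context or zero trigram count: use the unigram
            p = unigram_prob(model, tok)
        if p == 0.0:
            return 0.0
        log_p += math.log(p)
    return math.exp(log_p / len(seq))


# Toy training data (hypothetical).
TRAINING = [
    [("k", "onset"), ("ae", "nucleus"), ("t", "coda")],
    [("k", "onset"), ("ae", "nucleus"), ("p", "coda")],
    [("t", "onset"), ("ae", "nucleus"), ("p", "coda")],
]
MODEL = train(TRAINING)
```

Calling `confidence(MODEL, TRAINING[0])` scores the first training sequence; a sequence containing a token the model has never seen scores 0.0 under this unsmoothed sketch, which a practical system would avoid with smoothing.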

4. A hardware computer storage medium encoded with a computer program, causing the computer to execute steps comprising:
- decoding a representation of a speech signal using a position-dependent phonetic language model that provides probabilities of sequences of position-dependent phonetic tokens wherein each position-dependent phonetic token comprises a phone and a position identifier that identifies a position within a syllable, wherein decoding produces at least one sequence of position-dependent phonetic tokens;
- receiving a symbol represented by a portion of the speech signal, wherein the symbol is not part of the language of a lexicon; and
- annotating a lexicon by storing a sequence of position-dependent phonetic tokens as a pronunciation for the symbol that is not part of the language of the lexicon, wherein annotating the lexicon further comprises identifying syllable boundaries from the position-dependent phonetic tokens and placing syllable boundaries in the lexicon.

5. The hardware computer storage medium of claim 4 wherein a position identifier is one of a group of position identifiers consisting of onset consonant and coda consonant.
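The annotation step in claim 4 relies on the fact that position identifiers make syllable boundaries recoverable from the token sequence itself. The sketch below shows one way that could work; the boundary rule, the "nucleus" label, the `" . "` boundary marker, the entry layout, and the example symbol are assumptions for illustration and do not come from the patent.

```python
# Assumed ordering of positions inside one syllable.
RANK = {"onset": 0, "nucleus": 1, "coda": 2}


def split_syllables(tokens):
    """Group (phone, position) tokens into syllables.

    A new syllable starts when the position moves backwards in the
    onset -> nucleus -> coda order, or when two nuclei are adjacent.
    """
    syllables = []
    prev = None
    for phone, position in tokens:
        starts_new = (
            prev is None
            or RANK[position] < RANK[prev]
            or (position == "nucleus" and prev == "nucleus")
        )
        if starts_new:
            syllables.append([])
        syllables[-1].append(phone)
        prev = position
    return syllables


def annotate_lexicon(lexicon, symbol, tokens):
    """Store the token sequence and a syllabified pronunciation for a symbol."""
    syllables = split_syllables(tokens)
    lexicon[symbol] = {
        "tokens": list(tokens),
        "syllabified": " . ".join(" ".join(s) for s in syllables),
    }
    return lexicon[symbol]


# Hypothetical decoded tokens for a symbol that is missing from the lexicon.
DECODED = [("w", "onset"), ("ih", "nucleus"), ("n", "coda"),
           ("d", "onset"), ("ow", "nucleus")]
```

For example, `annotate_lexicon({}, "window", DECODED)` stores the five tokens together with a two-syllable pronunciation; consonant clusters such as two consecutive onsets stay in the same syllable under this rule.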

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Liu, Peng; Shi, Yu; Soong, Frank Kao-ping
Primary Examiner(s): Vo, Huyen X.
Application Number: US11/652,451
Publication Number: US 20080172224A1
Time in Patent Office: 1,888 Days
Field of Search: 704/254, 704/255, 704/220, 704/1, 704/3-4, 704/6-7, 704/9-10, 704/251, 704/231, 704/243, 704/244, 704/270, 704/242, 704/256.4
US Class Current: 704/255
CPC Class Codes:
G10L 15/183 using context dependencies,...
G10L 15/187 Phonemic context, e.g. pron...
