Method and apparatus for automatic speech segmentation into phoneme-like units for use in speech processing applications, and based on segmentation into broad phonetic classes, sequence-constrained vector quantization and hidden-markov-models

US 6,208,967 B1
Filed: 02/25/1997
Issued: 03/27/2001
Est. Priority Date: 02/27/1996
Status: Expired due to Fees


Abstract

For machine segmenting of speech, utterances from a database of known spoken words are first classified and segmented into three broad phonetic classes (BPC): voiced, unvoiced, and silence. Next, using the preliminary segmentation positions as anchor points, sequence-constrained vector quantization is used for further segmentation into phoneme-like units. Finally, exact tuning to the segmented phonemes is done through Hidden-Markov modelling, and after training a diphone set is composed for further use.
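The first stage described above can be illustrated with a minimal sketch. The abstract does not say which acoustic features drive the voiced/unvoiced/silence decision, so the short-time energy and zero-crossing-rate rule below, the function names, and all thresholds are assumptions chosen only for illustration, not the patented classifier.

```python
def classify_frames(samples, frame_len=160, silence_energy=1e-4, zcr_voiced_max=0.25):
    """Label each non-overlapping frame as 'S' (silence), 'V' (voiced) or 'U' (unvoiced).

    Assumed heuristic: low energy means silence; otherwise a low
    zero-crossing rate suggests voiced speech and a high one unvoiced speech.
    """
    labels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Mean squared amplitude of the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Fraction of adjacent sample pairs whose sign differs.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        zcr = crossings / (frame_len - 1)
        if energy < silence_energy:
            labels.append("S")
        elif zcr <= zcr_voiced_max:
            labels.append("V")
        else:
            labels.append("U")
    return labels


def preliminary_boundaries(labels):
    """Frame indices where the broad phonetic class changes (candidate anchor points)."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]
```

Applied to a signal made of silence, a low-frequency tone, and a rapidly alternating sequence, `preliminary_boundaries` returns the frame indices at which the class label changes; these play the role of the anchor points used by the next stage.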

194 Citations

6 Claims

1. A method for automatically segmenting speech for use in speech processing applications, said method comprising the steps of:
- classifying and segmenting utterances from a speech data base into three broad phonetic classes (BPC) voiced, unvoiced, and silence, for attaining preliminary segmentation positions;
- using preliminary segmentation positions as anchor points for further segmentation into phoneme-like units by sequence-constrained vector quantization (SCVQ) in an SCVQ-step;
- initializing phoneme Hidden-Markov-Models with the segments provided by the SCVQ-step, and further tuning of the HMM parameters by Baum-Welch estimation;
- finally, using the fully trained HMMs to perform Viterbi alignment of the utterances with respect to their phonetic transcription and in this way obtaining the final segmentation points.

2. A method as claimed in claim 1, further including the step of composing a diphone set after obtaining the final segmentation points.

3. A method as claimed in claim 1, wherein said speech processing is speech synthesis.
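The SCVQ-step in claim 1 can be pictured with a simplified stand-in. The sketch below assumes that the frames between two anchor points must be split into a known number of contiguous, ordered units, each represented by its centroid, and that the split minimizing total squared distortion is wanted. It solves that problem by dynamic programming. The function names and the cost function are hypothetical, and the procedure in the specification may differ.

```python
def segment_cost(frames, i, j):
    """Sum of squared distances from frames[i:j] to their centroid."""
    n = j - i
    dim = len(frames[0])
    centroid = [sum(f[d] for f in frames[i:j]) / n for d in range(dim)]
    return sum((f[d] - centroid[d]) ** 2 for f in frames[i:j] for d in range(dim))


def constrained_segmentation(frames, num_units):
    """Split frames into num_units contiguous segments with minimal distortion.

    Returns the interior boundary indices. Assumes len(frames) >= num_units.
    """
    total = len(frames)
    inf = float("inf")
    # best[k][t]: minimal distortion covering frames[:t] with k segments.
    best = [[inf] * (total + 1) for _ in range(num_units + 1)]
    back = [[0] * (total + 1) for _ in range(num_units + 1)]
    best[0][0] = 0.0
    for k in range(1, num_units + 1):
        for t in range(k, total + 1):
            for s in range(k - 1, t):
                if best[k - 1][s] == inf:
                    continue
                cost = best[k - 1][s] + segment_cost(frames, s, t)
                if cost < best[k][t]:
                    best[k][t] = cost
                    back[k][t] = s
    # Trace the segment start points back from the end of the sequence.
    starts = []
    t = total
    for k in range(num_units, 0, -1):
        t = back[k][t]
        starts.append(t)
    starts.reverse()
    return starts[1:]  # drop the leading 0, keep interior boundaries
```

Because every segment is contiguous and the segments follow one another in a fixed order, the result respects the sequence constraint that plain vector quantization would ignore.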

4. An apparatus for segmenting speech for use in speech processing applications, said apparatus comprising:
- BPC segmenting means fed by a speech data base for classifying and segmenting utterances received into three broad phonetic classes (BPC) voiced, unvoiced, and silence,
- SCVQ segmenting means fed by said BPC segmenting means for by using preliminary segmentation positions as anchor points executing further segmentation into phoneme-like units by sequence-constrained vector quantization (SCVQ),
- phone Hidden-Markov-Means (HMM) fed by said SCVQ segmenting means for initialization of phoneme HMM and further tuning of HMM parameters;
- final segmentation means controlled by said HMM.

5. An apparatus as claimed in claim 4, comprising diphone generating means fed by said segmentation means for composing a diphone set.

6. An apparatus as claimed in claim 4, furthermore comprising an output control stage for controlling a speech synthesis output stage through an intermediate storage stage between a tuning means and said speech synthesis output stage.
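Claims 2 and 5 refer to composing a diphone set from the final segmentation. The claims do not say where the diphone cut points are placed, so the following sketch assumes the common convention that a diphone runs from the midpoint of one phoneme segment to the midpoint of the next. The tuple layout and the function name are hypothetical.

```python
def compose_diphones(segments):
    """Build diphone spans from labelled phoneme segments.

    segments: list of (label, start, end) tuples in utterance order.
    Returns a list of (diphone_name, cut_start, cut_end), where the cut
    points are the midpoints of the two adjacent segments (an assumption).
    """
    diphones = []
    for (left, l_start, l_end), (right, r_start, r_end) in zip(segments, segments[1:]):
        name = f"{left}-{right}"
        diphones.append((name, (l_start + l_end) / 2, (r_start + r_end) / 2))
    return diphones
```

With segment boundaries expressed in frames, `compose_diphones([("sil", 0, 10), ("h", 10, 20), ("a", 20, 40)])` yields one span per adjacent phoneme pair.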

Current Assignee: US Philips Corporation (Koninklijke Philips N.V.)
Original Assignee: US Philips Corporation (Koninklijke Philips N.V.)
Inventors: Willems, Leonardus F. W.; Pauws, Stefan C.; Kamp, Yves G. C.
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): MCFADDEN, SUSAN IRIS
Application Number: US08/806,873
Time in Patent Office: 1,491 Days
Field of Search: 704/258, 704/256, 704/255, 704/253, 704/241, 704/242, 704/243, 704/244, 704/245
US Class Current: 704/256.8
CPC Class Codes:
G10L 15/04 Segmentation; Word boundary...
G10L 15/142 Hidden Markov Models [HMMs]
