Voice labeling error detecting system, voice labeling error detecting method and program

US 7,454,347 B2
Filed: 08/18/2004
Issued: 11/18/2008
Est. Priority Date: 08/27/2003
Status: Active Grant



Abstract

A labeling part 3 analyzes character string data to produce a phoneme label and a prosody label, partitions the voice data stored in a voice database 1 into phonemic data, and labels the phonemic data using the phoneme label and the like. A phoneme segmenting part 4 connects the voice data labeled as the same kind of phonemic data, and a formant extracting part 5 specifies the formant frequency of each piece of phonemic data. A processing part 6 decides an evaluation value for each piece of phonemic data based on the formant frequency, and an error detection part 7 detects the phonemic data for which the deviation of the evaluation value within a set of phonemic data reaches a predetermined amount.


7 Claims

1. A voice labeling error detecting system comprising:
- data acquisition means for acquiring waveform data representing a waveform of a unit voice and labeling data for identifying a kind of said unit voice;
  
  classification means for classifying the waveform data acquired by said data acquisition means into the kinds of unit voice, based on the labeling data acquired by said data acquisition means;
  
  evaluation value decision means for specifying a frequency of a formant of each unit voice represented by the waveform data acquired by said data acquisition means and determining an evaluation value of said waveform data based on the specified frequency; and
  
  error detection means for detecting the waveform data from among a set of waveform data classified into a same kind, for which a deviation of evaluation value within said set reaches a predetermined amount, and outputting the data representing said detected waveform data, as waveform data having a labeling error, and
  
  wherein said evaluation value H is calculated by the following formula representing a linear combination of values {|f(k)−F(k)|}:
  
  $H = \sum_{k=1}^{n} |f(k) - F(k)| \cdot W(k)$
  
  wherein F(k) is a frequency of the k-th formant of a unit voice indicated by the waveform data to calculate the evaluation value, and f(k) is an average value of the frequency of the k-th formant of the unit voice indicated by each waveform data classified into the same kind as said waveform data, W(k) is a weighting factor and n is the order of formant of the phoneme having the highest frequency.
- Dependent claims (2, 3, 4, 5, 6)
  - 2. The voice labeling error detecting system according to claim 1, characterized in that said evaluation value is a linear combination of plural frequencies of formants in a spectrum of acquired waveform data.
  - 3. The voice labeling error detecting system according to claim 1 or 2, characterized in that said evaluation value deciding means deals with a frequency at a maximal value of a spectrum in the waveform data as the frequency of formant of unit voice indicated by said waveform data.
  - 4. The voice labeling error detecting system according to any one of claim 1 or 2, characterized in that said evaluation value deciding means specifies an order of formant used to decide the evaluation value of the waveform data as the kind of unit voice indicated by said waveform data, corresponding to the kind of labeling data.
  - 5. The voice labeling error detecting system according to any one of claim 1 or 2, characterized in that said error detection means detects the waveform data associated with the labeling data indicating a voiceless state at which a magnitude of voice represented by said waveform data reaches a predetermined amount as the waveform data in which the labeling has an error.
  - 6. The voice labeling error detecting system according to claim 1 or 2, characterized in that said classification means comprises means for concatenating each waveform data classified into the same kind in the form in which two adjacent pieces of waveform data sandwiches data indicating a voiceless state therebetween.
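
Claims 3 and 5 can be illustrated with a minimal Python sketch. It assumes a magnitude spectrum has already been computed for each piece of waveform data, and it treats the strongest local maxima of that spectrum as formant frequencies, which is a simplification of whatever spectral analysis an actual system would use. The function names (`peak_frequencies`, `rms`, `voiceless_label_is_suspect`) and the RMS-based measure of "magnitude of voice" are hypothetical choices made for this illustration; they do not come from the patent.

```python
import math


def peak_frequencies(freqs, magnitudes, n_peaks):
    """Return the frequencies of the n_peaks strongest local maxima, ascending.

    freqs and magnitudes are parallel sequences describing a magnitude
    spectrum (hypothetical input format chosen for this sketch).
    """
    # An interior bin is a local maximum if it exceeds both neighbours.
    peaks = [
        i
        for i in range(1, len(magnitudes) - 1)
        if magnitudes[i] > magnitudes[i - 1] and magnitudes[i] > magnitudes[i + 1]
    ]
    # Keep the strongest peaks, then report them in order of frequency.
    strongest = sorted(peaks, key=lambda i: magnitudes[i], reverse=True)[:n_peaks]
    return sorted(freqs[i] for i in strongest)


def rms(samples):
    """Root-mean-square amplitude; 0.0 for an empty sequence."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def voiceless_label_is_suspect(samples, threshold):
    """Claim 5 idea: data labeled voiceless but whose magnitude reaches a threshold.

    Using RMS as the magnitude measure is an assumption of this sketch.
    """
    return rms(samples) >= threshold
```

For example, `peak_frequencies(freqs, magnitudes, 2)` would give two candidate formant frequencies for one unit voice, and `voiceless_label_is_suspect(samples, 0.1)` would flag a segment labeled as voiceless whose RMS amplitude is at least 0.1. The threshold value stands in for the "predetermined amount" of claim 5 and would need to be chosen for the data at hand.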

7. A voice labeling error detecting method comprising the steps of:
- acquiring waveform data representing a waveform of a unit voice and labeling data for identifying a kind of said unit voice;
  
  classifying said acquired waveform data into the kinds of unit voice, based on said acquired labeling data;
  
  specifying a frequency of a formant of each unit voice represented by the waveform data and deciding an evaluation value of said waveform data based on the specified frequency; and
  
  detecting the waveform data having a labeling error, from among a set of waveform data classified into a same kind, in which a deviation of evaluation value within said set reaches a predetermined amount and outputting data representing said detected waveform data,
  
  wherein said evaluation value H is calculated by the following formula representing a linear combination of values {|f(k)−F(k)|}:
  
  $H = \sum_{k=1}^{n} |f(k) - F(k)| \cdot W(k)$
  
  wherein F(k) is a frequency of the k-th formant of a unit voice indicated by the waveform data to calculate the evaluation value, and f(k) is an average value of the frequency of the k-th formant of the unit voice indicated by each waveform data classified into the same kind as said waveform data, W(k) is a weighting factor and n is the order of formant of the phoneme having the highest frequency.
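
The steps of claim 7 can be sketched in Python as follows. The sketch assumes the formant frequencies F(k) have already been extracted for each piece of waveform data, so each item is a `(label, [F(1), ..., F(n)])` pair. The claim only says that the deviation of the evaluation value "reaches a predetermined amount"; measuring that deviation as a multiple of the standard deviation of H within the class (`k_sigma`) is an assumption of this sketch, as are the function names.

```python
from collections import defaultdict
from statistics import mean, pstdev


def evaluation_values(items, weights):
    """Compute H = sum_k |f(k) - F(k)| * W(k) for every item.

    items: list of (label, [F(1), ..., F(n)]) pairs (hypothetical format).
    weights: [W(1), ..., W(n)].
    Returns a list of (label, H) in the same order as items.
    """
    # Classification step: group formant vectors by their label.
    by_label = defaultdict(list)
    for label, formants in items:
        by_label[label].append(formants)
    # f(k): mean of the k-th formant frequency within each label class.
    class_means = {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }
    scored = []
    for label, formants in items:
        h = sum(
            abs(f - F) * w
            for f, F, w in zip(class_means[label], formants, weights)
        )
        scored.append((label, h))
    return scored


def flag_label_errors(items, weights, k_sigma=2.0):
    """Return indices of items whose H deviates strongly within its class.

    Assumption: "deviation reaches a predetermined amount" is modelled as
    |H - mean(H)| >= k_sigma * population standard deviation of H.
    """
    scored = evaluation_values(items, weights)
    h_by_label = defaultdict(list)
    for label, h in scored:
        h_by_label[label].append(h)
    flagged = []
    for index, (label, h) in enumerate(scored):
        class_h = h_by_label[label]
        spread = pstdev(class_h)
        # A class with no spread in H has no outliers to report.
        if spread > 0 and abs(h - mean(class_h)) >= k_sigma * spread:
            flagged.append(index)
    return flagged
```

Calling `flag_label_errors(items, [1.0, 1.0])` on a list in which one item labeled "a" has formants far from the rest of its class returns the index of that item. With very small classes a standard-deviation criterion can never be met, so a fixed threshold on H may be a more practical choice of "predetermined amount" in that case.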


Current Assignee: Rakuten Group, Inc.
Original Assignee: Kabushiki Kaisha Kenwood (JVC Kenwood Corporation)
Inventors: Koyama, Rika
Primary Examiner(s): Wozniak; James S
Application Number: US10/920,454
Publication Number: US 20050060144A1
Time in Patent Office: 1,553 Days
Field of Search: 704/254, 704/258, 704/260, 704/262-264, 704/268
US Class Current: 704/268
CPC Class Codes: G10L 13/06 Elementary speech units use...
