
Speech processing method and apparatus for deciding emphasized portions of speech, and program therefor

  • US 8,793,124 B2
  • Filed: 04/05/2006
  • Issued: 07/29/2014
  • Est. Priority Date: 08/08/2001
  • Status: Active Grant
First Claim

1. A speech processing method performed using a processor for deciding whether a portion of input speech is emphasized or not based on a set of speech parameters for each frame, comprising the steps of:

  • (a) obtaining from a codebook a plurality of speech parameter vectors each corresponding to a respective set of speech parameters obtained from respective ones of a plurality of frames in the portion of the input speech, said codebook storing, for each of a plural number of predetermined speech parameter vectors, a corresponding pair of a normal-state appearance probability and an emphasized-state appearance probability both predetermined using a training speech signal, each of said plural number of predetermined speech parameter vectors being composed of a set of speech parameters including at least one of a fundamental frequency, power and a temporal variation of dynamic-measure and/or an inter-frame difference in at least one of those speech parameters, and obtaining from said codebook a pair of an emphasized-state appearance probability and a normal-state appearance probability both corresponding to each speech parameter vector obtained for the respective ones of the plurality of frames in the portion of the input speech;

    (b) using the processor, calculating an emphasized-state likelihood of the portion of the input speech by multiplying together emphasized-state appearance probabilities corresponding to the respective speech parameter vectors for the plurality of frames in the portion of the input speech, and calculating a normal-state likelihood of the portion of the input speech by multiplying together normal-state appearance probabilities corresponding to the respective speech parameter vectors for the plurality of frames in the portion of the input speech; and

    (c) deciding whether the portion of the input speech is emphasized or not based on said calculated emphasized-state likelihood and said calculated normal-state likelihood, and outputting a decision result of said deciding, the decision result indicating whether the portion of the input speech is emphasized or not,

    wherein the codebook stores, for each of the plural predetermined speech parameter vectors, a respective independent emphasized-state appearance probability and a respective set of conditional emphasized-state appearance probabilities, both used as respective said emphasized-state appearance probability, and stores, for each of the plural predetermined speech parameter vectors, a respective independent normal-state appearance probability and a set of conditional normal-state appearance probabilities, both used as respective said normal-state appearance probability, such that there is at least stored a separate conditional emphasized-state appearance probability and a separate conditional normal-state appearance probability for a possible speech parameter vector that immediately follows the respective speech parameter vector in the codebook, and

    wherein the step of calculating the emphasized-state likelihood in said step (b) is implemented by multiplying together the independent emphasized-state appearance probability and the conditional emphasized-state appearance probabilities corresponding to the speech parameter vectors of respective first frame and subsequent frames in said portion of the input speech, and the step of calculating the normal-state likelihood in said step (b) is implemented by multiplying together the independent normal-state appearance probability and the conditional normal-state appearance probabilities corresponding to the speech parameter vectors of respective said first frame and said subsequent frames in said portion of the input speech.
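Steps (a) through (c) can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions the claim does not fix: frames are mapped to the nearest codebook vector by Euclidean distance, the decision in step (c) is a plain comparison of the two likelihoods, and the names `CodebookEntry`, `quantize`, `log_likelihoods`, `is_emphasized`, and the `floor` parameter are hypothetical. The sketch also sums log-probabilities, which is mathematically equivalent to the claimed multiplication but avoids numerical underflow on long portions.

```python
import math
from dataclasses import dataclass, field


@dataclass
class CodebookEntry:
    """One predetermined speech parameter vector and its stored probabilities."""
    vector: tuple                      # e.g. (fundamental frequency, power, ...)
    p_emph: float                      # independent emphasized-state probability
    p_norm: float                      # independent normal-state probability
    # Conditional probabilities of the code that immediately follows this one,
    # keyed by the index of the following code (hypothetical layout).
    p_emph_next: dict = field(default_factory=dict)
    p_norm_next: dict = field(default_factory=dict)


def quantize(frame, codebook):
    """Step (a): index of the codebook vector nearest to the frame (assumed metric)."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(frame, codebook[i].vector))


def log_likelihoods(codes, codebook, floor=1e-12):
    """Step (b): independent probability for the first frame, conditional
    probabilities for each subsequent frame, accumulated in the log domain."""
    first = codebook[codes[0]]
    log_e = math.log(max(first.p_emph, floor))
    log_n = math.log(max(first.p_norm, floor))
    for prev, cur in zip(codes, codes[1:]):
        entry = codebook[prev]
        # Unseen transitions fall back to `floor` (an assumption, not in the claim).
        log_e += math.log(max(entry.p_emph_next.get(cur, 0.0), floor))
        log_n += math.log(max(entry.p_norm_next.get(cur, 0.0), floor))
    return log_e, log_n


def is_emphasized(frames, codebook):
    """Step (c): decide by comparing the two likelihoods (assumed decision rule)."""
    codes = [quantize(f, codebook) for f in frames]
    log_e, log_n = log_likelihoods(codes, codebook)
    return log_e > log_n
```

In practice the probabilities would come from a labeled training speech signal, as the claim requires; the sketch leaves that training step out and expects a prebuilt list of `CodebookEntry` objects.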
