Pitch determination using speech classification and prior pitch estimation
First Claim
1. A speech encoding system for encoding a speech signal including a previous pitch lag and a current pitch lag, the speech encoding system comprising:
an adaptive codebook for storing excitation vectors associated with corresponding pitch lag candidates; and
an encoder processing circuit for identifying the pitch lag candidates for at least one of a frame and a sub-frame of the speech signal;
the encoder processing circuit selecting a preferential one of the pitch lag candidates as the current pitch lag based on at least two of the following:
a first timing relationship, a second timing relationship, and voiced classification;
the first timing relationship concerning a temporal relationship between the previous pitch lag and at least one of the pitch lag candidates, the second timing relationship concerning a temporal relationship between at least two of the pitch lag candidates, the voiced classification pertaining to an interval of the speech signal.
Abstract
A multi-rate speech codec supports a plurality of encoding bit rate modes by adaptively selecting encoding bit rate modes to match communication channel restrictions. In higher bit rate encoding modes, an accurate representation of speech through CELP (code-excited linear prediction) and other associated modeling parameters are generated for higher quality decoding and reproduction. To achieve high quality in lower bit rate encoding modes, the speech encoder departs from the strict waveform-matching criteria of regular CELP coders and strives to identify significant perceptual features of the input signal. To support lower bit rate encoding modes, a variety of techniques are applied, many of which involve the classification of the input signal. For each bit rate mode selected, pluralities of fixed or innovation subcodebooks are selected for use in generating innovation vectors. The speech encoder also utilizes an adaptive weighting factor in the selection of a current pitch lag value from a plurality of pitch lag candidates. For example, if the speech encoder identifies an integer multiple timing relationship between any two pitch lag candidates, the pitch lag candidate with the smallest timing value is favored through adjustment of the weighting factor. Similarly, if a pitch lag candidate exhibits timing that corresponds to that of previous pitch lag values, the weighting factor is adjusted to favor that candidate.
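The weighting behavior described in the abstract can be illustrated with a minimal Python sketch. The function name `select_pitch_lag`, the `boost` factor of 1.15, and the tolerance `tol` are hypothetical choices made for illustration; the abstract does not give numeric values or a specific procedure.

```python
def select_pitch_lag(candidates, prev_lag=None, boost=1.15, tol=2):
    """Pick a pitch lag from a list of (lag, correlation) pairs.

    boost and tol are illustrative assumptions, not values from the patent.
    """
    weighted = {}
    for lag, corr in candidates:
        weight = 1.0
        # Favor a candidate whose lag is close to the previous pitch lag.
        if prev_lag is not None and abs(lag - prev_lag) <= tol:
            weight *= boost
        # Favor the smaller lag when some larger candidate is roughly
        # an integer multiple (k >= 2) of it.
        for other, _ in candidates:
            if other > lag:
                k = round(other / lag)
                if k >= 2 and abs(other - k * lag) <= tol:
                    weight *= boost
                    break
        weighted[lag] = corr * weight
    # The candidate with the largest weighted correlation wins.
    return max(weighted, key=weighted.get)
```

With candidates at lags 40 and 80, the lag of 40 receives the boost because 80 is twice 40, so it can win even when its raw correlation is slightly lower. Supplying a previous lag near 80 shifts the balance back toward 80.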
37 Claims
-
1. A speech encoding system for encoding a speech signal including a previous pitch lag and a current pitch lag, the speech encoding system comprising:
-
an adaptive codebook for storing excitation vectors associated with corresponding pitch lag candidates; and
an encoder processing circuit for identifying the pitch lag candidates for at least one of a frame and a sub-frame of the speech signal;
the encoder processing circuit selecting a preferential one of the pitch lag candidates as the current pitch lag based on at least two of the following:
a first timing relationship, a second timing relationship, and voiced classification;
the first timing relationship concerning a temporal relationship between the previous pitch lag and at least one of the pitch lag candidates, the second timing relationship concerning a temporal relationship between at least two of the pitch lag candidates, the voiced classification pertaining to an interval of the speech signal.
Dependent claims: 2, 3, 4, 5, 6, 7, 34
-
-
8. A speech encoding system for encoding a speech signal that has a current pitch lag, the speech encoding system comprising:
-
an adaptive codebook;
an encoder processing circuit that identifies a plurality of pitch lag candidates; and
the encoder processing circuit applying an adaptive weighting factor to a pitch correlation to favor selection of at least one of the pitch lag candidates over at least one other of the pitch lag candidates if at least one of a first timing relationship and a second timing relationship is detected;
the first timing relationship associated with one of the pitch lag candidates and the second timing relationship being between at least two of the pitch lag candidates;
the encoder processing circuit selecting one of the pitch lag candidates as the current pitch lag by comparing the weighted pitch correlation to another pitch correlation.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 35, 36
-
-
16. A method for speech encoding, the method comprising:
-
identifying a plurality of pitch lag candidates;
using an adaptive weighting factor applied to a pitch correlation to favor at least one of the pitch lag candidates over at least one other of the pitch lag candidates if at least one of a first timing relationship and a second timing relationship is detected;
the first timing relationship associated with one of the pitch lag candidates and the second timing relationship being between at least two of the pitch lag candidates; and
selecting one of the plurality of the pitch lag candidates as a current pitch lag estimate by comparing the weighted pitch correlation to another pitch correlation.
Dependent claims: 17, 18, 19, 20, 37
-
-
21. A method of encoding a speech signal, the method comprising the steps of:
-
identifying a plurality of pitch lag candidates for a present interval of the speech signal;
determining if a previous interval, with respect to the present interval, contains a voiced component;
comparing the identified pitch lag candidates to at least one previous pitch lag value for a previous interval, to identify at least one favored one of the pitch lag candidates that falls within a temporal neighborhood of the previous pitch lag value if the previous interval contains a generally voiced component; and
favoring selection of the at least one favored one of the pitch lag candidates as a preferential one of the pitch lag candidates by weighting a pitch correlation for at least one favored candidate differently than a remainder of the pitch lag candidates.
Dependent claims: 22, 23, 24, 25, 26
comparing the identified pitch lag candidates to each other;
detecting a second timing relationship if the compared pitch lag candidates have pitch lags related approximately by an integer multiple of each other.
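The voiced-gated neighborhood weighting recited in claim 21 might be sketched as follows. This is only an illustration under stated assumptions: the function name `weight_by_previous_lag`, the neighborhood `radius`, and the multiplicative `factor` are hypothetical, and the claim does not specify how the weighting is computed.

```python
def weight_by_previous_lag(correlations, prev_lag, prev_voiced,
                           radius=3, factor=1.1):
    """Return a copy of {lag: correlation} with near-previous lags favored.

    radius and factor are illustrative assumptions, not patent values.
    """
    # Only apply the weighting when the previous interval was voiced.
    if not prev_voiced:
        return dict(correlations)
    # Scale up correlations for lags within `radius` of the previous lag;
    # all other candidates keep their original correlation.
    return {
        lag: corr * factor if abs(lag - prev_lag) <= radius else corr
        for lag, corr in correlations.items()
    }
```

When the previous interval is unvoiced, the correlations pass through unchanged, so the previous lag has no influence on the selection.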
-
-
26. The method according to claim 25 further comprising the steps of:
favoring selection of a second favored one of the pitch lag candidates with a second timing relationship as the preferential one of the pitch lag candidates by weighting the pitch correlation for the second favored one differently than a remainder of the pitch lag candidates.
-
27. A method of encoding a speech signal, the method comprising the steps of:
-
identifying a plurality of pitch lag candidates for a present interval of the speech signal;
determining if a previous interval, with respect to the present interval, contains a voiced component;
comparing identified pitch lag candidates to each other;
detecting a timing relationship if the compared pitch lag candidates have pitch lags related approximately by an integer multiple of each other; and
favoring selection of at least one favored one of the pitch lag candidates with the timing relationship as a preferential one of the pitch lag candidates by weighting a pitch correlation for the at least one favored candidate differently than a remainder of the pitch lag candidates.
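One way to detect candidates related "approximately by an integer multiple" is sketched below. The function name `integer_multiple_pairs` and the relative tolerance `rel_tol` are assumptions for illustration; the claim does not define how close to an exact multiple the lags must be.

```python
def integer_multiple_pairs(lags, rel_tol=0.05):
    """Return (small, large, k) for each pair where large ~= k * small, k >= 2.

    lags must be positive; rel_tol is an illustrative assumption.
    """
    pairs = []
    ordered = sorted(lags)
    for i, small in enumerate(ordered):
        for large in ordered[i + 1:]:
            # Nearest integer ratio between the two lags.
            k = round(large / small)
            # Accept only true multiples (k >= 2) within the tolerance.
            if k >= 2 and abs(large - k * small) <= rel_tol * large:
                pairs.append((small, large, k))
    return pairs
```

Each returned tuple names the smaller lag first, which is the candidate the abstract says is favored when such a relationship is found.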
-
-
28. A method of encoding a speech signal, the method comprising:
-
identifying a plurality of regions of the pitch lag;
determining a local maximum correlation between a target speech signal and a synthesized speech signal within each of the identified regions to provide a set of local maximum correlations; and
selecting a global maximum correlation among the determined local maximum correlations to facilitate selection of a pitch lag for a present interval of a speech signal.
Dependent claims: 29, 30, 31, 32, 33
comparing the selected global maximum correlation to local maximum correlations if the selected global maximum is outside of the first or predecessor region of the regions.
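The region-based search of claim 28 can be sketched in Python as below, taking precomputed correlations per lag as input. The function name `pick_lag_by_region`, the region boundaries, and the `ratio` threshold are hypothetical. In particular, the final rule (preferring a shorter-lag region whose local maximum is within `ratio` of the global maximum) is an assumption for illustration; the claims state only that the global maximum is compared to local maxima.

```python
def pick_lag_by_region(corr, regions, ratio=0.9):
    """corr: {lag: correlation}; regions: list of inclusive (lo, hi) lag
    ranges ordered from shortest to longest lags."""
    # Local maximum correlation within each region.
    local = []
    for lo, hi in regions:
        lags = [lag for lag in corr if lo <= lag <= hi]
        if lags:
            best = max(lags, key=corr.get)
            local.append((best, corr[best]))
    if not local:
        raise ValueError("no candidate lags fall inside the given regions")
    # Global maximum among the local maxima.
    best_lag, best_corr = max(local, key=lambda p: p[1])
    # Assumed comparison rule: accept the first shorter-lag local maximum
    # that comes close enough to the global maximum.
    for lag, c in local:
        if lag < best_lag and c >= ratio * best_corr:
            return lag
    return best_lag
```

Setting `ratio=1.0` effectively disables the assumed preference for shorter lags unless a shorter-lag region ties the global maximum.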
-
-
31. The method according to claim 30 further comprising:
applying weighting to pitch correlation values for candidate pitch lags based on a first timing relationship reflecting a neighborhood of a preferential candidate in relation to other candidate pitch lags associated with the regions prior to the comparing step.
-
32. The method according to claim 31 further comprising:
applying weighting to pitch correlation values for candidate pitch lags based on a second timing relationship, modifying the values of the determined local maximum correlations prior to the comparing step.
-
33. The method according to claim 31 further comprising:
applying weighting to the pitch correlation values for candidate pitch lags based on both a first timing relationship reflecting a selected candidate in relation to previous pitch lag values and a second relationship reflecting a selected candidate in relation to other candidate pitch lag values.
Specification