Speech encoder with features extracted from current and previous frames
Abstract
In a speech signal encoder device comprising a frame divider (31) for producing original speech frames, a mode decision circuit (49) decides a predetermined number of modes by using feature quantities extracted from each current speech frame, segmented from an input speech signal at a predetermined frame period as short as 5 ms, and from a previous speech frame segmented at least one frame period prior to the current speech frame. Preferably, a weighting circuit (47) provides the current speech frame by perceptually weighting the original speech frames into weighted speech frames. The feature quantities may be provided as a primary quantity and, as a secondary quantity, a rate of variation in the primary quantity. Each feature quantity is preferably adjusted into an adjusted quantity in response to each current mode decided by using the current speech frame and a previous mode decided at least one frame period prior to the current mode. Each feature quantity may be a pitch prediction gain, a short-period prediction gain, a level, or a pitch of each original speech frame.
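The framing and feature-extraction steps named in the abstract can be illustrated with a minimal sketch. It is not the patented circuit: the 8 kHz sampling rate, the function names `segment_frames`, `frame_level_db` and `pitch_prediction_gain_db`, and the single-tap pitch predictor are all assumptions chosen for the example.

```python
import math

SAMPLE_RATE_HZ = 8000   # assumption: narrowband speech
FRAME_PERIOD_MS = 5     # frame period mentioned in the abstract
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_PERIOD_MS // 1000  # 40 samples per frame


def segment_frames(signal, frame_len=FRAME_LEN):
    """Split a sample sequence into consecutive full frames."""
    n_frames = len(signal) // frame_len
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


def frame_level_db(frame, floor=1e-10):
    """Mean power of one frame in dB (one possible 'level' feature)."""
    power = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(max(power, floor))


def pitch_prediction_gain_db(signal, start, frame_len, lag, floor=1e-10):
    """Single-tap pitch prediction gain of the frame beginning at `start`.

    The frame is predicted from the samples `lag` positions earlier. The gain
    is the ratio of frame energy to prediction-residual energy, in dB.
    """
    if start - lag < 0:
        raise ValueError("not enough past samples for this lag")
    cur = signal[start:start + frame_len]
    past = signal[start - lag:start - lag + frame_len]
    energy = sum(x * x for x in cur)
    cross = sum(x * y for x, y in zip(cur, past))
    past_energy = sum(y * y for y in past)
    # Optimal one-tap predictor coefficient (least squares)
    beta = cross / past_energy if past_energy > 0 else 0.0
    residual = sum((x - beta * y) ** 2 for x, y in zip(cur, past))
    return 10.0 * math.log10(max(energy, floor) / max(residual, floor))
```

A periodic input gives a large gain when `lag` matches its period and a gain near 0 dB when the lagged samples are uncorrelated with the frame, which is why a quantity of this kind can help separate voiced from unvoiced frames.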
10 Claims
1. A speech signal encoder device comprising:
segmenting means for segmenting an input speech signal into original speech frames at a predetermined frame period;
deciding means for using said original speech frames in deciding a predetermined number of modes of said original speech frames to produce decided mode results;
weighting means for perceptually weighting said original speech frames into weighted speech frames; and
encoding means for encoding said input speech signal into codes at said frame period and in response to said modes to produce said decided mode results and said codes as an encoder device output signal,
wherein said deciding means makes use, in deciding a current mode of said modes for each current speech frame segmented from said input speech signal at said frame period, of feature quantities of at least one kind extracted from said current speech frame and a previous speech frame segmented at least one frame period prior to said current speech frame and of a previous mode decided at least one frame period prior to said current mode,
wherein said deciding means uses said weighted speech frames in deciding said modes,
wherein said feature quantities are rates of variation with time in said feature quantities,
said speech signal encoder device further comprising:
means for extracting each of primary quantities of said feature quantities from said current speech frame,
wherein said deciding means comprises:
means for extracting said rates of variation from said current and said previous speech frames as secondary quantities of said feature quantities; and
mode deciding means for deciding said current mode in response to said primary and said secondary quantities and said previous mode.
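As a rough illustration of the mode decision recited in claim 1, the sketch below decides a mode from a primary quantity, its rate of variation (the secondary quantity), and the previous mode. The four modes, the threshold values, the hysteresis margin, and the names `rate_of_variation`, `decide_mode` and `decide_modes` are assumptions made for this example; the claim does not specify any of them.

```python
def rate_of_variation(current_value, previous_value):
    """Secondary quantity: change of a feature over one frame period."""
    return current_value - previous_value


def decide_mode(primary, secondary, previous_mode,
                thresholds=(1.0, 4.0, 7.0), transient_delta=3.0,
                hysteresis=0.5):
    """Return a mode in 0..len(thresholds) for the current frame.

    Assumed rule: each threshold is shifted so that staying in the previous
    mode is easier than leaving it, unless the feature is changing quickly,
    in which case the shift is dropped so the mode can follow the transient.
    """
    margin = 0.0 if abs(secondary) > transient_delta else hysteresis
    mode = 0
    for k, t in enumerate(thresholds):
        # Boundary k separates mode k from mode k + 1
        shifted = t - margin if previous_mode > k else t + margin
        if primary >= shifted:
            mode = k + 1
    return mode


def decide_modes(primaries, initial_mode=0):
    """Decide one mode per frame from a sequence of primary quantities."""
    modes = []
    previous_mode = initial_mode
    previous_value = primaries[0] if primaries else 0.0
    for value in primaries:
        secondary = rate_of_variation(value, previous_value)
        mode = decide_mode(value, secondary, previous_mode)
        modes.append(mode)
        previous_mode, previous_value = mode, value
    return modes
```

With these assumed thresholds, a primary quantity of 3.8 alone would fall in mode 1, but it stays in mode 2 when the previous frame was in mode 2 and the feature is changing slowly, which shows how the previous mode and the rate of variation can both influence the result.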
3. A speech signal encoder device comprising segmenting means for segmenting an input speech signal into original speech frames at a predetermined frame period, deciding means for using said original speech frames in deciding a predetermined number of modes of said original speech frames to produce decided mode results, extracting means for extracting pitches from said input speech signal, and encoding means for encoding said input speech signal into codes at said frame period and in response to said modes to produce said decided mode results and said codes as an encoder device output signal, wherein:
said extracting means comprises:
feature quantity extracting means for extracting feature quantities by using at least each current speech frame segmented from said input speech signal at said frame period; and
feature quantity adjusting means for using said feature quantities as said pitches to adjust said pitches into adjusted pitches in response to each current mode decided for said current speech frame and a previous mode decided at least one frame period prior to said current mode;
said encoding means encoding said input speech signal into said codes in response further to said adjusted pitches.
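One way a pitch could be adjusted in response to the current and previous modes is sketched below. The rule used here (correcting pitch doubling or halving only when both frames are in voiced modes) is an assumption for illustration, as are the name `adjust_pitch`, the set of voiced modes, and the continuity ratio; claim 3 does not prescribe a particular adjustment.

```python
def adjust_pitch(pitch, previous_adjusted_pitch, current_mode, previous_mode,
                 voiced_modes=(2, 3), max_ratio=1.2):
    """Return an adjusted pitch lag (in samples) for the current frame.

    Assumed rule: when both the current and the previous mode are voiced,
    the raw pitch, its half and its double are compared with the previous
    adjusted pitch, and the closest candidate is used if it lies within
    `max_ratio` of the previous value. Otherwise the raw pitch is kept.
    """
    both_voiced = (current_mode in voiced_modes
                   and previous_mode in voiced_modes)
    if not both_voiced or pitch <= 0 or previous_adjusted_pitch <= 0:
        return pitch
    candidates = [pitch, pitch / 2.0, pitch * 2.0]
    best = min(candidates, key=lambda c: abs(c - previous_adjusted_pitch))
    ratio = (max(best, previous_adjusted_pitch)
             / min(best, previous_adjusted_pitch))
    return best if ratio <= max_ratio else pitch
```

For example, a raw lag of 80 samples following an adjusted lag of 41 in two voiced frames is treated as a doubling error and mapped to 40, while the same raw lag after an unvoiced frame is left unchanged.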
7. A speech signal encoder device comprising segmenting means for segmenting an input speech signal into original speech frames at a predetermined frame period, deciding means for using said original speech frames in deciding a predetermined number of modes of said original speech frames to produce decided mode results, extracting means for extracting levels from said input speech signal, and encoding means for encoding said input speech signal into codes at said frame period and in response to said modes to produce said decided mode results and said codes as an encoder device output signal, wherein:
said extracting means comprises:
feature quantity extracting means for extracting feature quantities by using at least each current speech frame segmented from said input speech signal at said frame period; and
feature quantity adjusting means for using said feature quantities as said levels to adjust said levels into adjusted levels in response to each current mode decided for said current speech frame and a previous mode decided at least one frame period prior to said current mode;
said encoding means encoding said input speech signal into said codes in response further to said adjusted levels.
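The level adjustment of claim 7 can be sketched in the same spirit. The smoothing rule, the weight, and the names `adjust_level` and `adjust_levels` are assumptions for this example only: the level is averaged with the previous adjusted level while the mode is unchanged and passed through when the mode changes.

```python
def adjust_level(level_db, previous_adjusted_db, current_mode, previous_mode,
                 steady_weight=0.5):
    """Return an adjusted frame level in dB.

    Assumed rule: smooth the level while the mode is steady, and follow the
    raw level immediately when the mode has just changed.
    """
    if current_mode != previous_mode:
        return level_db
    return (steady_weight * previous_adjusted_db
            + (1.0 - steady_weight) * level_db)


def adjust_levels(levels_db, modes, steady_weight=0.5):
    """Apply adjust_level frame by frame; the first frame is left as is."""
    adjusted = []
    for i, (level, mode) in enumerate(zip(levels_db, modes)):
        if i == 0:
            adjusted.append(level)
        else:
            adjusted.append(adjust_level(level, adjusted[-1], mode,
                                         modes[i - 1], steady_weight))
    return adjusted
```

Because each output depends on the previous adjusted level, short level fluctuations inside a run of identical modes are damped, while a mode change resets the smoothing.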
Specification