Speech encoder with features extracted from current and previous frames

US 5,787,389 A
Filed: 01/17/1996
Issued: 07/28/1998
Est. Priority Date: 01/17/1995
Status: Expired due to Term

Abstract

In a speech signal encoder device comprising a frame divider (31) for producing original speech frames, a mode decision circuit (49) decides a predetermined number of modes by using feature quantities extracted from each current speech frame, segmented from an input speech signal at a predetermined frame period as short as 5 ms, and from a previous speech frame segmented at least one frame period prior to the current speech frame. Preferably, a weighting circuit (47) provides the current speech frame by perceptually weighting the original speech frames into weighted speech frames. The feature quantities may be provided as a primary quantity and, as a secondary quantity, a rate of variation in the primary quantity. Each feature quantity is preferably adjusted into an adjusted quantity in response to each current mode decided by using the current speech frame and a previous mode decided at least one frame period prior to the current mode. Each feature quantity may be a pitch prediction gain, a short-period predicted gain, a level, or a pitch of each original speech frame.
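The framing and feature extraction described in the abstract can be illustrated with a minimal Python sketch. It is not the patented circuit: the 8 kHz sampling rate, the use of RMS level as the primary quantity, the frame-to-frame difference as the rate of variation, and all function names (`segment`, `frame_level`, `primary_and_secondary`) are assumptions made for illustration.

```python
import math

SAMPLE_RATE_HZ = 8000   # assumed sampling rate; the abstract does not state one
FRAME_PERIOD_MS = 5     # frame period mentioned in the abstract
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_PERIOD_MS // 1000  # 40 samples per frame


def segment(signal, frame_len=FRAME_LEN):
    """Split a sample sequence into consecutive, non-overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    n_frames = len(signal) // frame_len
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


def frame_level(frame):
    """Primary quantity (illustrative choice): RMS level of one frame."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def primary_and_secondary(frames):
    """Return a list of (level, rate_of_variation) pairs, one per frame.

    The rate of variation is taken here as the difference between the
    current and the previous frame's level (an assumption). The first
    frame has no predecessor, so its rate is set to 0.0.
    """
    features = []
    prev_level = None
    for frame in frames:
        level = frame_level(frame)
        rate = 0.0 if prev_level is None else level - prev_level
        features.append((level, rate))
        prev_level = level
    return features
```

For a signal whose amplitude steps from 1.0 to 3.0 at a frame boundary, the second frame's secondary quantity is the level jump of 2.0, which is the kind of cue a mode decision could use to detect an onset.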

10 Claims

1. A speech signal encoder device comprising:
- segmenting means for segmenting an input speech signal into original speech frames at a predetermined frame period;
  
  deciding means for using said original speech frames in deciding a predetermined number of modes of said original speech frames to produce decided mode results;
  
  weighting means for perceptually weighting said original speech frames into weighted speech frames; and
  
  encoding means for encoding said input speech signal into codes at said frame period and in response to said modes to produce said decided mode results and said codes as an encoder device output signal,
  
  wherein said deciding means makes use, in deciding a current mode of said modes for each current speech frame segmented from said input speech signal at said frame period, of feature quantities of at least one kind extracted from said current speech frame and a previous speech frame segmented at least one frame period prior to said current speech frame and of a previous mode decided at least one frame period prior to said current mode,
  
  wherein said deciding means uses said weighted speech frames in deciding said modes,
  
  wherein said feature quantities are rates of variation with time in said feature quantities,
  
  said speech signal encoder device further comprising;
  
  means for extracting each of primary quantities of said feature quantities from said current speech frame,
  
  wherein said deciding means comprises;
  
  means for extracting said rates of variation from said current and said previous speech frames as secondary quantities of said feature quantities; and
  
  mode deciding means for deciding said current mode in response to said primary and said secondary quantities and said previous mode.
- - 2. A speech signal encoder device as claimed in claim 1, wherein:
    - said mode deciding means adjusts said current mode into an adjusted mode in response to said primary and said secondary quantities and said previous mode;
      
      said encoding means using, as said modes, adjusted modes produced by said mode deciding means for said input speech signal.
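
As a rough illustration of claims 1 and 2, the following Python sketch decides a mode from a primary quantity, a secondary quantity (its rate of variation), and the previous mode. The claims do not give thresholds, the number of modes, or the adjustment rule, so everything here is hypothetical: four modes, the threshold values, the "hold the previous mode when the feature is nearly steady" rule, and the names `decide_mode` and `decide_modes`.

```python
def decide_mode(gain_db, gain_rate_db, prev_mode,
                thresholds=(1.0, 4.0, 7.0), steady_rate_db=0.5):
    """Return a mode in 0..3 for the current frame.

    gain_db       primary quantity, e.g. a pitch prediction gain in dB
    gain_rate_db  secondary quantity: change in gain_db since the previous frame
    prev_mode     mode decided for the previous frame
    All thresholds are hypothetical values chosen for illustration.
    """
    # Provisional mode from the primary quantity alone:
    # count how many thresholds the gain reaches.
    mode = sum(gain_db >= t for t in thresholds)
    # Adjustment (assumed rule): if the feature is nearly steady and the
    # provisional mode is only one step away from the previous mode,
    # keep the previous mode to avoid flickering between neighbours.
    if abs(gain_rate_db) < steady_rate_db and abs(mode - prev_mode) == 1:
        mode = prev_mode
    return mode


def decide_modes(gains_db, initial_mode=0):
    """Run decide_mode over a sequence of per-frame gains."""
    modes = []
    prev_gain, prev_mode = None, initial_mode
    for gain in gains_db:
        rate = 0.0 if prev_gain is None else gain - prev_gain
        mode = decide_mode(gain, rate, prev_mode)
        modes.append(mode)
        prev_gain, prev_mode = gain, mode
    return modes
```

With this rule, a gain that drifts slowly across a threshold does not change the mode, while an abrupt change does, which is one plausible reason for feeding both the rate of variation and the previous mode into the decision.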

3. A speech signal encoder device comprising segmenting means for segmenting an input speech signal into original speech frames at a predetermined frame period, deciding means for using said original speech frames in deciding a predetermined number of modes of said original speech frames to produce decided mode results, extracting means for extracting pitches from said input speech signal, and encoding means for encoding said input speech signal into codes at said frame period and in response to said modes to produce said decided mode results and said codes as an encoder device output signal, wherein:
- said extracting means comprises;
  
  feature quantity extracting means for extracting feature quantities by using at least each current speech frame segmented from said input speech signal at said frame period; and
  
  feature quantity adjusting means for using said feature quantities as said pitches to adjust said pitches into adjusted pitches in response to each current mode decided for said current speech frame and a previous mode decided at least one frame period prior to said current mode;
  
  said encoding means encoding said input speech signal into said codes in response further to said adjusted pitches.
- - 4. A speech signal encoder device as claimed in claim 3, further comprising weighting means for perceptually weighting said original speech frames into weighted speech frames, wherein said deciding means uses said weighted speech frames in deciding said modes.
  - 5. A speech signal encoder device as claimed in claim 3, wherein said feature quantity extracting means extracts said pitches in response to said current speech frame and rates of variation with time in said pitches in response to said current speech frame and a previous speech frame segmented at least one frame period prior to said current speech frame.
  - 6. A speech signal encoder device as claimed in claim 3, wherein each of said feature quantities is one of a pitch prediction gain, a short-period predicted gain, a level, and a pitch of said current speech frame.
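
Claim 3 recites adjusting extracted pitches in response to the current and previous modes but does not say how. One plausible reading, sketched below in Python under stated assumptions, is a continuity correction: when both frames are in strongly periodic modes, an extracted pitch lag that looks like a doubling or halving error is replaced by the candidate closest to the previous pitch. The set `VOICED_MODES`, the `max_jump` tolerance, and the function name `adjust_pitch` are hypothetical.

```python
VOICED_MODES = {2, 3}  # assumption: modes treated as strongly periodic


def adjust_pitch(pitch, prev_pitch, mode, prev_mode, max_jump=0.2):
    """Return an adjusted pitch lag (in samples) for the current frame.

    The extracted lag is kept unless both the current and previous modes
    are voiced and the half or double lag lies within max_jump (as a
    fraction of prev_pitch) of the previous pitch.
    """
    if mode not in VOICED_MODES or prev_mode not in VOICED_MODES:
        # No continuity assumption when either frame is not voiced.
        return pitch
    # Candidates: the extracted lag plus its half and its double.
    candidates = [pitch, pitch / 2, pitch * 2]
    best = min(candidates, key=lambda c: abs(c - prev_pitch))
    # Accept the closest candidate only if it is near the previous pitch.
    if abs(best - prev_pitch) <= max_jump * prev_pitch:
        return best
    return pitch
```

For example, an extracted lag of 80 samples following a previous pitch of 41 samples in two voiced frames is adjusted to 40.0, whereas the same lag is left unchanged when the current frame is in a non-voiced mode.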

7. A speech signal encoder device comprising segmenting means for segmenting an input speech signal into original speech frames at a predetermined frame period, deciding means for using said original speech frames in deciding a predetermined number of modes of said original speech frames to produce decided mode results, extracting means for extracting levels from said input speech signal, and encoding means for encoding said input speech signal into codes at said frame period and in response to said modes to produce said decided mode results and said codes as an encoder device output signal, wherein:
- said extracting means comprises;
  
  feature quantity extracting means for extracting feature quantities by using at least each current speech frame segmented from said input speech signal at said frame period; and
  
  feature quantity adjusting means for using said feature quantities as said levels to adjust said levels into adjusted levels in response to each current mode decided for said current speech frame and a previous mode decided at least one frame period prior to said current mode;
  
  said encoding means encoding said input speech signal into said codes in response further to said adjusted levels.
- - 8. A speech signal encoder device as claimed in claim 7, further comprising weighting means for perceptually weighting said original speech frames into weighted speech frames, wherein said deciding means uses said weighted speech frames in deciding said modes.
  - 9. A speech signal encoder device as claimed in claim 8, wherein said feature quantity extracting means extracts said levels in response to said current speech frame and rates of variation with time in said levels in response to said current speech frame and a previous speech frame segmented at least one frame period prior to said current speech frame.
  - 10. A speech signal encoder device as claimed in claim 9, wherein each of said feature quantities is one of a pitch prediction gain, a short-period predicted gain, a level, and a pitch of said current speech frame.
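
Claim 7 applies the same idea to levels. The Python sketch below shows one possible adjustment, assumed rather than taken from the patent: the level is smoothed against the previous adjusted level while the mode is unchanged, and passed through unmodified when the mode changes so that onsets are not smeared. The `smoothing` factor and the names `adjust_level` and `adjust_levels` are hypothetical.

```python
def adjust_level(level, prev_adjusted, mode, prev_mode, smoothing=0.5):
    """Return an adjusted level for the current frame.

    If the mode changed (or there is no previous frame), the extracted
    level is used as is. Otherwise it is averaged with the previous
    adjusted level using the given smoothing weight.
    """
    if prev_adjusted is None or mode != prev_mode:
        return level
    return smoothing * prev_adjusted + (1.0 - smoothing) * level


def adjust_levels(levels, modes):
    """Apply adjust_level frame by frame to parallel level and mode lists."""
    adjusted = []
    prev_adjusted, prev_mode = None, None
    for level, mode in zip(levels, modes):
        value = adjust_level(level, prev_adjusted, mode, prev_mode)
        adjusted.append(value)
        prev_adjusted, prev_mode = value, mode
    return adjusted
```

With levels `[10.0, 14.0, 2.0, 4.0]` and modes `[1, 1, 0, 0]`, the second and fourth frames are smoothed while the third, where the mode changes, keeps its extracted level.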

Current Assignee: Rakuten, Inc. (Rakuten Group, Inc.)
Original Assignee: NEC Corporation
Inventors: Ozawa, Kazunori; Taumi, Shin-Ichi
Primary Examiner(s): Tung, Kee M.
Application Number: US08/588,005
Time in Patent Office: 923 Days
Field of Search: 395/2.16, 395/2.17, 395/2.28-2.34, 395/2.2, 395/2.23, 395/2.38, 395/2.39, 395/2.71-2.73, 395/2.91-2.95, 704/207, 704/208, 704/211, 704/214, 704/219-225, 704/229, 704/230, 704/262-264, 704/500-504
US Class Current: 704/219
CPC Class Codes:
G10L 19/0018   Speech coding using phoneti...
G10L 19/06   Determination or coding of ...
G10L 19/08   Determination or coding of ...
G10L 19/18   Vocoders using multiple modes
G10L 2025/906   Pitch tracking
