Speech recognition

US 6,804,643 B1
Filed: 10/27/2000
Issued: 10/12/2004
Est. Priority Date: 10/29/1999
Status: Expired due to Fees

First Claim

1. A speech recognition feature extractor for extracting speech features from a speech signal, comprising:

a time-to-frequency domain transformer for generating spectral magnitude values in the frequency domain from the speech signal;

a frequency domain filtering block for generating a sub-band value relating to spectral magnitude values of a certain frequency sub-band, for each of a group of frequency sub-bands;

a compression block for compressing said sub-band values;

a transformation block for obtaining a set of de-correlated features from the sub-band values; and

a normalising block for normalizing features;

said feature extractor comprising a mean emphasising block for emphasizing at least one of the sub-band values after frequency domain filtering, wherein the emphasising is accomplished by addition of a mean value of sub-band signals to said at least one of the sub-band values.

Abstract

A speech recognition feature extractor for extracting speech features from a speech signal, comprising: a time-to-frequency domain transformer (FFT) for generating spectral magnitude values in the frequency domain from the speech signal; a frequency domain filtering block (Mel) for generating a sub-band value relating to spectral magnitude values of a certain frequency sub-band; a compression block (LOG) for compressing said sub-band values; a transformation block (DCT) for obtaining a set of de-correlated features from the compressed sub-band values; and a normalising block (CN) for normalising de-correlated features.
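The block chain named in the abstract (FFT, Mel, LOG, DCT, CN) can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the triangular mel filter shape, the mel-scale formula, the natural-log compression, the unnormalised DCT-II, the reading of CN as per-coefficient mean subtraction, and every function name and parameter default are assumptions chosen for the example. A naive DFT stands in for the FFT to keep the sketch dependency-free.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (stand-in for the FFT block)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(num_bands, num_bins, sample_rate):
    """Triangular filters spaced evenly on the mel scale (assumed shape)."""
    top = hz_to_mel(sample_rate / 2.0)
    # num_bands + 2 band edges in Hz, evenly spaced in mel
    edges = [mel_to_hz(top * i / (num_bands + 1)) for i in range(num_bands + 2)]
    bin_hz = (sample_rate / 2.0) / (num_bins - 1)
    bank = []
    for b in range(num_bands):
        lo, mid, hi = edges[b], edges[b + 1], edges[b + 2]
        weights = []
        for k in range(num_bins):
            f = k * bin_hz
            if lo < f <= mid:
                weights.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                weights.append((hi - f) / (hi - mid))
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank

def dct_ii(values, num_coeffs):
    """Unnormalised DCT-II, used here as the de-correlating transform."""
    n = len(values)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / n)
                for i, v in enumerate(values))
            for c in range(num_coeffs)]

def extract_features(frames, sample_rate=8000, num_bands=8, num_coeffs=4):
    """Run FFT -> Mel -> LOG -> DCT per frame, then CN over all frames."""
    num_bins = len(frames[0]) // 2 + 1
    bank = mel_filterbank(num_bands, num_bins, sample_rate)
    cepstra = []
    for frame in frames:
        spectrum = magnitude_spectrum(frame)                     # FFT
        sub_bands = [sum(w * s for w, s in zip(row, spectrum))   # Mel
                     for row in bank]
        compressed = [math.log(v + 1e-10) for v in sub_bands]    # LOG
        cepstra.append(dct_ii(compressed, num_coeffs))           # DCT
    # CN (assumed): subtract each coefficient's mean over the frames
    means = [sum(col) / len(cepstra) for col in zip(*cepstra)]
    return [[c - m for c, m in zip(row, means)] for row in cepstra]
```

Calling `extract_features` with a list of equal-length sample frames returns one normalised feature vector per frame. The small constant added before the logarithm only guards against a zero sub-band value and is not taken from the patent.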

10 Claims

1. A speech recognition feature extractor for extracting speech features from a speech signal, comprising:
- a time-to-frequency domain transformer for generating spectral magnitude values in the frequency domain from the speech signal;
  
  a frequency domain filtering block for generating a sub-band value relating to spectral magnitude values of a certain frequency sub-band, for each of a group of frequency sub-bands;
  
  a compression block for compressing said sub-band values;
  
  a transformation block for obtaining a set of de-correlated features from the sub-band values; and
  
  a normalising block for normalizing features;
  
  said feature extractor comprising a mean emphasising block for emphasizing at least one of the sub-band values after frequency domain filtering, wherein the emphasising is accomplished by addition of a mean value of sub-band signals to said at least one of the sub-band values.

2. A speech recognition feature extractor according to claim 1, wherein the mean emphasising block is arranged to mean emphasise all the sub-band values.

3. A speech recognition feature extractor according to claim 1, wherein the mean emphasising block is arranged to mean emphasise some of the sub-band values.

4. A speech recognition feature extractor according to claim 1, wherein said frequency domain filtering block is arranged to generate sub-band values according to a scale based on an auditory model.

5. A speech recognition feature extractor according to claim 1, comprising a differentiation block for generation of first time derivatives and second time derivatives for each of said de-correlated features; and wherein

6. A speech recognition feature extractor according to claim 1, wherein, in said addition of the mean value, said mean emphasising block is arranged to add a mean estimate term to each sub-band value that is to be mean emphasised.

7. A speech recognition feature extractor according to claim 6, wherein the mean emphasising block is arranged to calculate the mean estimate term from compressed sub-band values representing a series of at least two subsequent speech frames.
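
The mean emphasising step recited in claims 1, 2, 3, 6 and 7 can be illustrated with a short sketch. It assumes the mean estimate term is a simple moving average of the compressed values of the same sub-band over the most recent frames. The claims only require that a mean value be added and that the estimate may be computed from a series of frames, so the averaging scheme, the function name `mean_emphasise`, and the `window` and `bands` parameters are all hypothetical.

```python
from collections import deque

def mean_emphasise(compressed_frames, window=3, bands=None):
    """Add a running mean estimate to selected compressed sub-band values.

    compressed_frames: list of frames, each a list of compressed sub-band values.
    window: number of recent frames averaged for the mean estimate (assumed).
    bands: indices of sub-bands to emphasise; None means all of them.
    """
    num_bands = len(compressed_frames[0])
    selected = set(range(num_bands)) if bands is None else set(bands)
    # One bounded history per sub-band holds the most recent values
    history = [deque(maxlen=window) for _ in range(num_bands)]
    out = []
    for frame in compressed_frames:
        emphasised = []
        for b, value in enumerate(frame):
            history[b].append(value)
            if b in selected:
                mean_estimate = sum(history[b]) / len(history[b])
                emphasised.append(value + mean_estimate)
            else:
                emphasised.append(value)
        out.append(emphasised)
    return out
```

Passing `bands=None` corresponds to emphasising all sub-band values, and passing a subset of indices corresponds to emphasising only some of them. In this sketch the first frame has only itself as history, so the estimate covers at least two frames only from the second frame onward.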

8. A mobile station comprising a speech recognition feature extractor for extracting speech features from a speech signal, said extractor comprising:
- a time-to-frequency domain transformer for generating spectral magnitude values in the frequency domain from the speech signal;
  
  a frequency domain filtering block for generating a sub-band value relating to spectral magnitude values of a certain frequency sub-band, for each of a group of frequency sub-bands;
  
  a compression block for compressing said sub-band values;
  
  a transformation block for obtaining a set of de-correlated features from the sub-band values; and
  
  a normalising block for normalising features;
  
  said feature extractor comprising a mean emphasising block for emphasising at least one of the sub-band values after frequency domain filtering, wherein the emphasising is accomplished by addition of a mean value of sub-band signals to said at least one of the sub-band values.

9. A method for extracting speech features from a speech signal, comprising the steps of:
- generating spectral magnitude values in the frequency domain from the speech signal;
  
  generating a sub-band value relating to spectral magnitude values of a certain frequency sub-band;
  
  compressing said sub-band values;
  
  obtaining a set of de-correlated features from the sub-band values;
  
  normalising features; and
  
  emphasising at least one of the sub-band values after frequency domain filtering, wherein the emphasising is accomplished by addition of a mean value of sub-band signals to said at least one of the sub-band values.

10. A computer program for extracting speech features from a speech signal, comprising:
- a computer readable program means for causing a computer to generate spectral magnitude values in the frequency domain from the speech signal;
  
  a computer readable program means for causing a computer to generate a sub-band value relating to spectral magnitude values of a certain frequency sub-band, for each of a group of frequency sub-bands;
  
  a computer readable program means for causing a computer to compress said sub-band values;
  
  a computer readable program means for causing a computer to obtain a set of de-correlated features from the sub-band values;
  
  a computer readable program means for causing a computer to normalise features; and
  
  a computer readable mean-emphasising program means for causing a computer to emphasise at least one of the sub-band values after frequency domain filtering, wherein emphasising is accomplished by addition of a mean value of sub-band signals to said at least one of the sub-band values.

Current Assignee: Nokia Mobile Phones UK Limited (Nokia Corporation)
Original Assignee: Nokia Mobile Phones UK Limited (Nokia Corporation)
Inventors: Kiss, Imre
Primary Examiner(s): ABEBE, DANIEL DEMELASH
Application Number: US09/698,805
Time in Patent Office: 1,446 Days
Field of Search: 704/243, 704/234, 704/235, 704/246, 704/261
US Class Current: 704/234
CPC Class Codes:
G10L 15/02 Feature extraction for spee...
G10L 25/27 characterised by the analys...
