Method for speech analysis and speech recognition
Abstract
A method of speech analysis calculates one or more difference parameters for each of a sequence of acoustic frames, where each difference parameter is a function of the difference between an acoustic parameter in one frame and an acoustic parameter in a nearby frame. The method is used in speech recognition which compares the difference parameters of each frame against acoustic models representing speech units, where each speech-unit model has a model of the difference parameters associated with the frames of its speech unit. The difference parameters can be slope parameters or energy difference parameters. Slope parameters are derived by finding the difference between the energy of a given spectral parameter of a given frame and the energy, in a nearby frame, of a spectral parameter associated with a different frequency band. The resulting parameter indicates the extent to which the frequency of energy in the part of the spectrum represented by the given parameter is going up or going down. Energy difference parameters are calculated as a function of the difference between a given spectral parameter in one frame and a spectral parameter in a nearby frame representing the same frequency band. In one embodiment of the invention, dynamic programming compares the difference parameters of a sequence of frames to be recognized against a sequence of dynamic programming elements associated with each of a plurality of speech-unit models. In another embodiment of the invention, each speech-unit model represents one phoneme, and the speech-unit models for a plurality of phonemes are compared against individual frames, to associate with each such frame the one or more phonemes whose models compare most closely with it.
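The two kinds of difference parameter described in the abstract can be illustrated with a minimal sketch. The abstract does not give a formula, so the one below is an assumption made only for illustration: the slope parameter compares how closely the energy of a band in one frame matches the energy of the next-higher band and of the next-lower band in a nearby frame. The names `energy_difference`, `slope_parameter`, and `offset` are hypothetical and do not come from the patent.

```python
def energy_difference(frames, t, band, offset=1):
    """Energy difference parameter: same band, nearby frame.

    frames is a list of frames; each frame is a list of band energies.
    """
    return frames[t + offset][band] - frames[t][band]


def slope_parameter(frames, t, band, offset=1):
    """Illustrative slope parameter (assumed formula, not the patent's).

    Positive when the energy in `band` appears to move to a higher band
    in the nearby frame, negative when it appears to move to a lower band.
    """
    if band < 1 or band > len(frames[t]) - 2:
        raise ValueError("band needs a neighbour on both sides")
    here = frames[t][band]
    nearby = frames[t + offset]
    # Difference against the adjacent higher band in the nearby frame.
    diff_up = abs(nearby[band + 1] - here)
    # Difference against the adjacent lower band in the nearby frame.
    diff_down = abs(nearby[band - 1] - here)
    # A small diff_up means the energy reappears one band higher.
    return diff_down - diff_up


# A spectral peak that moves from band 1 to band 2 between two frames.
rising = [[0, 10, 0, 0], [0, 0, 10, 0]]
# A spectral peak that moves from band 2 to band 1 between two frames.
falling = [[0, 0, 10, 0], [0, 10, 0, 0]]

print(slope_parameter(rising, 0, 1))
print(slope_parameter(falling, 0, 2))
print(energy_difference(rising, 0, 1))
```

In this sketch the sign of the slope parameter carries the direction of frequency movement, while the energy difference parameter only reports the change within a single band.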
18 Claims
1. A method of speech analysis comprising:
representing a length of speech as a temporal sequence of frames, with each frame representing speech sounds at one of a succession of brief time periods;
analyzing each frame of speech to obtain a plurality of spectral parameters, each of which represents the energy at one of a series of different frequency bands;
finding the difference between the energy of a given spectral parameter of a given frame and the energy, in a nearby frame, of a spectral parameter associated with an energy band which is close to, but different than, the frequency band represented by said given spectral parameter; and
using that difference to calculate a slope parameter which provides an indication of the extent to which the frequency of the acoustic energy in the part of the spectrum represented by said given spectral parameter is going up or going down.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A method of speech recognition comprising:
representing a length of speech as a temporal sequence of frames, with each frame representing speech sounds at one of a succession of brief time periods, and with each frame containing a plurality of acoustic parameters;
calculating one or more difference parameters in association with each of a plurality of said frames, with each difference parameter being calculated as a function of the difference between a first acoustic parameter in one such frame and a second acoustic parameter in a nearby frame, in which for one or more of said difference parameters, said first parameter and said second parameter are associated with different frequencies; and
comparing the difference parameters associated with individual frames against each of a plurality of acoustic models representing speech units, where each such speech-unit model has a model for the difference parameters associated with frames that correspond to the speech unit it represents.
Dependent claims: 14, 15, 16, 17, 18
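The comparing step of claim 13 can be pictured with a short sketch. It rests on assumptions that are not in the claim: each speech-unit model is reduced to a mean vector of difference parameters, and closeness is measured by squared Euclidean distance. The claim does not specify the form of the model or the scoring function. The names `distance`, `closest_units`, `label_frames`, and the toy models are hypothetical.

```python
def distance(params, model_mean):
    """Squared Euclidean distance (an assumed scoring function)."""
    return sum((p - m) ** 2 for p, m in zip(params, model_mean))


def closest_units(params, models, n_best=1):
    """Return the names of the n_best models closest to one frame."""
    ranked = sorted(models, key=lambda name: distance(params, models[name]))
    return ranked[:n_best]


def label_frames(frame_params, models, n_best=1):
    """Compare every frame's difference parameters against every model."""
    return [closest_units(p, models, n_best) for p in frame_params]


# Toy speech-unit models: one mean vector of difference parameters each.
models = {
    "unit_a": [5.0, -5.0],
    "unit_b": [-5.0, 5.0],
}
# Difference parameters computed for two frames (made-up values).
frame_params = [[4.0, -6.0], [-4.0, 4.0]]

print(label_frames(frame_params, models))
```

Setting `n_best` above 1 associates each frame with several candidate units, in the spirit of the phoneme-labelling embodiment mentioned in the abstract.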
Specification