
Speaker independent isolated word recognition system using neural networks

  • US 5,566,270 A
  • Filed: 05/05/1994
  • Issued: 10/15/1996
  • Est. Priority Date: 05/05/1993
  • Status: Expired due to Term
First Claim

1. A speaker independent isolated word recognition apparatus, comprising:

  • digitizing means for digitizing a speech signal and subjecting the digitized speech signal to spectral analysis at constant temporal intervals using fast Fourier transform, to obtain an analysis result;

  • means connected to said digitizing means for subjecting the analysis result to an orthogonal transformation to obtain cepstral parameters and a logarithm of the total energy contained in each temporal interval, to yield characteristic parameters of the speech signal for each temporal interval;

  • means for detecting word ends through an energy level of the respective speech signal; and

  • a recognizer (RNA), in which complete words are modelled with Markov model automata of a left-to-right type with recursion on states, each state corresponding to an acoustic portion of the word, and in which recognition is carried out through dynamic programming according to a Viterbi algorithm on all automata to find the one with a minimum-cost path, which corresponds to the recognized word indicated at output (PR), emission probabilities being calculated with a feedback neural network having parallel processing neurons, the neural network being trained by:

    initialization:

    a. initialization of the neural network with small random synaptic weights;

    b. creation of a first segmentation by segmenting a training set of words uniformly;

    iteration by:

    initialization of the training set with all the segmented words;

    random choice of a word not already learned;

    updating of synaptic weights wij for a word by applying a correlative training by varying a neural network input according to a window sliding from left to right on the word and supplying for every input window a suitable objective vector at an output, constructed by setting a 1 on the neuron corresponding to a state to which the input window belongs, according to the segmentation, and by setting 0 on all the other neurons;

    segmentation recomputation for the considered word, by using the neural network as previously trained, and performing dynamic programming only with the correct model;

    updating of the segmentation St+1;

    if there are still unconsidered words in the training set, repeat from the random choice;

    recomputation of transition probabilities of automata; and

    if the number of iterations on the training set is greater than a maximum preset number NMAX, terminate; otherwise return to initialization of the training set.
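The recognition step in the claim — dynamic programming by the Viterbi algorithm over left-to-right automata with recursion (self-loops) on states, selecting the automaton with the minimum-cost path — can be sketched as follows. This is a minimal illustration, not the patented implementation: function names are made up, and the costs stand in for negative log probabilities (the claim's neural-network emissions).

```python
import math

def viterbi_cost(emission_costs, stay_cost, advance_cost):
    """Minimum-cost left-to-right alignment of T frames to S states.

    emission_costs[t][s]: cost (e.g. -log emission probability) of frame t
    in state s. Each state may repeat (recursion on states) or advance to
    the next state; the path must start in state 0 and end in state S-1.
    """
    T = len(emission_costs)
    S = len(emission_costs[0])
    D = [math.inf] * S
    D[0] = emission_costs[0][0]          # must start in the first state
    for t in range(1, T):
        newD = [math.inf] * S
        for s in range(S):
            stay = D[s] + stay_cost[s]
            move = D[s - 1] + advance_cost[s - 1] if s > 0 else math.inf
            newD[s] = min(stay, move) + emission_costs[t][s]
        D = newD
    return D[S - 1]                      # must end in the last state

def recognize(word_models, emission_costs_per_word):
    """Return the word whose automaton yields the minimum Viterbi cost."""
    best_word, best_cost = None, math.inf
    for word, (stay, advance) in word_models.items():
        cost = viterbi_cost(emission_costs_per_word[word], stay, advance)
        if cost < best_cost:
            best_word, best_cost = word, cost
    return best_word
```

The two-transition topology (stay or advance by one) is the simplest reading of "left-to-right type with recursion on states"; real systems may also allow state skips.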
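The training-target construction in the iteration above (a 1 on the output neuron of the state the input window belongs to, 0 on all others, as the window slides left to right) and the initial uniform segmentation can be sketched as below; names are illustrative and this omits the actual weight update.

```python
def uniform_segmentation(num_frames, num_states):
    """Initial segmentation: split a word's frames uniformly across states."""
    return [min(t * num_states // num_frames, num_states - 1)
            for t in range(num_frames)]

def training_targets(num_frames, segmentation, num_states):
    """Build the 1-of-N objective vector for each window position.

    segmentation[t] is the state index the frame (window) t belongs to
    under the current segmentation; the target sets 1 on the neuron for
    that state and 0 on all the other neurons.
    """
    targets = []
    for t in range(num_frames):
        vec = [0] * num_states
        vec[segmentation[t]] = 1
        targets.append(vec)
    return targets
```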
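The step "recomputation of transition probabilities of automata" can likewise be sketched by counting stays versus advances in the updated segmentation — a simple maximum-likelihood count; the claim does not specify the estimator, so this formula is an assumption.

```python
def stay_probabilities(segmentation, num_states):
    """Re-estimate P(stay in state s) of a left-to-right automaton from a
    segmentation, by counting state repetitions versus advances."""
    stay = [0] * num_states
    advance = [0] * num_states
    for prev, cur in zip(segmentation, segmentation[1:]):
        if cur == prev:
            stay[prev] += 1      # self-loop taken (recursion on state)
        else:
            advance[prev] += 1   # moved to the next state
    probs = []
    for s in range(num_states):
        total = stay[s] + advance[s]
        probs.append(stay[s] / total if total else 0.0)
    return probs  # probs[s] = P(stay in s); P(advance) = 1 - probs[s]
```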

  • 2 Assignments