Method of speech recognition using multimodal variational inference with switching state space models
First Claim
1. A method of setting posterior probability means for posterior probability distributions in a switching state space model, the posterior probability providing the likelihood of a set of hidden states for a sequence of frames based upon input values associated with the sequence of frames, the method comprising:
- inputting a speech signal;
- identifying input values of a sequence of frames from the speech signal;
- defining a window containing at least two but fewer than all of the frames in the sequence of frames;
- determining a separate posterior probability mean for each frame in the window, each posterior probability mean providing a mean value for a continuous hidden state given at least an input value, wherein determining a separate posterior probability mean for each frame further comprises determining a separate posterior probability mean for each of a set of discrete hidden states that are different from the continuous hidden states;
- shifting the window so that it includes at least one subsequent frame in the sequence of frames to form a shifted window; and
- determining a separate posterior probability mean for each frame in the shifted window; and
- using the posterior probability means to decode a speech signal.
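The window-and-shift bookkeeping recited in this claim can be sketched as follows. This is a minimal illustration only, not the patented estimator: the function names, the `shift` parameter, the rule that later windows overwrite earlier estimates, and the `toy_estimate` formula are all assumptions introduced here to show how a table of one mean per frame and per discrete state might be filled.

```python
from typing import Callable, Iterator, List, Tuple


def sliding_windows(num_frames: int, window_size: int,
                    shift: int = 1) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) frame ranges, end exclusive, moving left to right."""
    if not 2 <= window_size < num_frames:
        raise ValueError("window must hold at least two but fewer than all frames")
    if not 1 <= shift <= window_size:
        raise ValueError("shift must be between 1 and window_size")
    start = 0
    while True:
        end = min(start + window_size, num_frames)
        yield start, end
        if end == num_frames:
            break
        start += shift


def windowed_means(observations: List[float], num_states: int,
                   window_size: int, shift: int,
                   estimate_window: Callable[[List[float], int], List[float]]
                   ) -> List[List[float]]:
    """Return means[t][s]: one mean per frame t and discrete state s."""
    means = [[0.0] * num_states for _ in observations]
    for start, end in sliding_windows(len(observations), window_size, shift):
        for state in range(num_states):
            window_means = estimate_window(observations[start:end], state)
            for offset, value in enumerate(window_means):
                # Assumption: estimates from later windows overwrite earlier ones.
                means[start + offset][state] = value
    return means


def toy_estimate(window_obs: List[float], state: int) -> List[float]:
    """Hypothetical stand-in: blend each input value with a per-state target."""
    target = float(state)
    return [0.5 * y + 0.5 * target for y in window_obs]


if __name__ == "__main__":
    print(list(sliding_windows(6, 3)))
    print(windowed_means([0.0, 2.0, 4.0, 6.0], 2, 2, 1, toy_estimate))
```

Passing the per-window estimator in as a callback keeps the window schedule separate from whatever equations actually produce the means.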
Abstract
A method of efficiently setting posterior probability parameters for a switching state space model begins by defining a window containing at least two but fewer than all of the frames in a sequence of frames. A separate posterior probability parameter is determined for each frame in the window. The window is then shifted sequentially from left to right in time so that it includes one or more subsequent frames in the sequence. A separate posterior probability parameter is then determined for each frame in the shifted window. This method closely approximates a more rigorous solution while reducing computational cost by two to three orders of magnitude. Further, a method of determining the optimal discrete state sequence in the switching state space model is provided that directly exploits the observation vector on a frame-by-frame basis and operates from left to right in time.
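To make the contrast between a whole-sequence solution and a windowed one concrete, the sketch below solves a coupled linear system either over all frames at once or one window at a time. It uses a toy stand-in model, not the patent's equations: each hidden value is tied to its observation (variance `obs_var`) and to its neighbouring frames (variance `trans_var`), which gives a tridiagonal system. All names, the model, and the overwrite rule are assumptions, and the sketch makes no claim about the accuracy or cost figures stated in the abstract.

```python
from typing import List


def solve_tridiagonal(lower: List[float], diag: List[float],
                      upper: List[float], rhs: List[float]) -> List[float]:
    """Thomas algorithm; lower[i] and upper[i] multiply x[i-1] and x[i+1] in row i."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def smooth_block(obs: List[float], obs_var: float = 1.0,
                 trans_var: float = 1.0) -> List[float]:
    """Jointly estimate the hidden values for one block of frames (toy model)."""
    n = len(obs)
    diag = []
    for t in range(n):
        neighbours = (1 if t > 0 else 0) + (1 if t < n - 1 else 0)
        diag.append(1.0 / obs_var + neighbours / trans_var)
    lower = [0.0] + [-1.0 / trans_var] * (n - 1)
    upper = [-1.0 / trans_var] * (n - 1) + [0.0]
    rhs = [y / obs_var for y in obs]
    return solve_tridiagonal(lower, diag, upper, rhs)


def windowed_smooth(obs: List[float], window_size: int, shift: int = 1,
                    obs_var: float = 1.0, trans_var: float = 1.0) -> List[float]:
    """Solve one small system per window, shifting left to right."""
    n = len(obs)
    means = [0.0] * n
    start = 0
    while True:
        end = min(start + window_size, n)
        # Assumption: each window is solved in isolation and overwrites earlier values.
        means[start:end] = smooth_block(obs[start:end], obs_var, trans_var)
        if end == n:
            break
        start += shift
    return means


if __name__ == "__main__":
    signal = [0.0, 0.5, 1.5, 3.0, 3.5, 4.0]
    print(smooth_block(signal))                    # all frames coupled at once
    print(windowed_smooth(signal, window_size=3))  # small per-window systems
```

Each windowed solve involves only `window_size` unknowns, which is where a saving over a single system spanning every frame would come from in this toy setting.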
18 Citations
7 Claims
7. A method of decoding a speech signal to identify a sequence of phonetic units, the method comprising:
- storing model parameters for a switching state space model in which there are discrete hidden states and continuous hidden states, the continuous hidden states being dependent on the discrete hidden states;
- converting the speech signal into a set of observation vectors, each observation vector associated with a separate frame of the speech signal;
- for each frame of the speech signal:
  - determining a posterior probability mean for each discrete hidden state, the posterior probability mean defining a mean value for a continuous hidden state given a discrete hidden state and an observation vector, wherein determining a posterior probability mean comprises defining a window of frames that contains fewer than all of the frames of the speech signal and determining a separate posterior probability mean for each discrete hidden state in each frame in the window by solving a set of simultaneous equations; and
  - determining a path score for at least one path into each discrete hidden state in the frame based on the posterior probability mean for the respective discrete hidden state; and
  - using the path score to select a single path into each discrete hidden state of the frame.
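The path-selection steps of this claim resemble a Viterbi-style recursion, which can be sketched as below. This is an illustrative assumption, not the scoring defined in the specification: the Gaussian `frame_log_score`, the `transition_scores` table, and scalar observations are hypothetical choices made here so that a path score can be computed from a posterior mean for each discrete state.

```python
import math
from typing import List, Tuple


def frame_log_score(observation: float, posterior_mean: float,
                    variance: float = 1.0) -> float:
    """Hypothetical frame score: Gaussian log-likelihood around the posterior mean."""
    diff = observation - posterior_mean
    return -0.5 * (math.log(2.0 * math.pi * variance) + diff * diff / variance)


def select_paths(observations: List[float], means: List[List[float]],
                 transition_scores: List[List[float]]) -> Tuple[List[int], float]:
    """Keep a single best path into each discrete state at every frame.

    means[t][s] is the posterior mean for state s at frame t;
    transition_scores[p][s] is the log score of moving from state p to state s.
    """
    num_states = len(means[0])
    path_score = [frame_log_score(observations[0], means[0][s])
                  for s in range(num_states)]
    backpointers: List[List[int]] = []
    for t in range(1, len(observations)):
        new_scores = []
        pointers = []
        for s in range(num_states):
            # Select the single best predecessor path into state s.
            best_prev = max(range(num_states),
                            key=lambda p: path_score[p] + transition_scores[p][s])
            pointers.append(best_prev)
            new_scores.append(path_score[best_prev] + transition_scores[best_prev][s]
                              + frame_log_score(observations[t], means[t][s]))
        path_score = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(num_states), key=lambda s: path_score[s])
    best_score = path_score[state]
    best_path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        best_path.append(state)
    best_path.reverse()
    return best_path, best_score


if __name__ == "__main__":
    obs = [0.1, 0.2, 4.9, 5.1]
    means = [[0.0, 5.0] for _ in obs]       # toy per-state means
    trans = [[0.0, -1.0], [-1.0, 0.0]]      # toy log transition scores
    print(select_paths(obs, means, trans))
```

Because only one predecessor is kept per state and frame, the work per frame grows with the square of the number of discrete states rather than with the number of possible state sequences.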
Specification