Methods and systems for identifying keywords in speech signal
Abstract
The disclosed embodiments relate to a method of keyword recognition in a speech signal. The method includes determining a first likelihood score and a second likelihood score of one or more features of a frame of the speech signal being associated with one or more states in a first model and one or more states in a second model, respectively. The one or more states in the first model correspond to one or more tied triphone states, and the one or more states in the second model correspond to one or more monophone states, of a keyword to be recognized in the speech signal. The method further includes determining a third likelihood score based on the first likelihood score and the second likelihood score. The first likelihood score and the third likelihood score are used to determine the presence of the keyword in the speech signal.
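As a rough illustration of how the first and third scores might drive the decision, the sketch below assumes that per-frame log-likelihoods are already available from the two models. The combination rule for the third score (a log-domain average of the first and second scores), the averaging over frames, the threshold, and the names `log_mean_exp` and `keyword_present` are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math


def log_mean_exp(a, b):
    """Log of the average of two likelihoods given as log-likelihoods."""
    m = max(a, b)  # subtract the max for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m)) - math.log(2.0)


def keyword_present(triphone_ll, monophone_ll, threshold=0.0):
    """Decide keyword presence from per-frame log-likelihoods.

    triphone_ll:  first scores (tied triphone model), one per frame
    monophone_ll: second scores (monophone model), one per frame
    threshold:    hypothetical decision threshold on the mean margin
    """
    if len(triphone_ll) != len(monophone_ll):
        raise ValueError("score sequences must have the same length")
    if not triphone_ll:
        return False

    margin = 0.0
    for first, second in zip(triphone_ll, monophone_ll):
        # Assumed rule: the third score stands in for "not the keyword".
        third = log_mean_exp(first, second)
        margin += first - third

    # Average margin per frame compared against the threshold.
    return margin / len(triphone_ll) > threshold
```

Under this assumed rule, frames where the tied triphone model scores higher than the monophone model push the margin up, and frames where it scores lower push the margin down.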
15 Claims
1. A method of keyword recognition in a speech signal, the method comprising:
sampling, by one or more processors, the speech signal in one or more frames;
determining, by the one or more processors, a first likelihood score of one or more features of a frame, of the one or more frames, of the speech signal being associated with one or more states in a first model, wherein the one or more states in the first model correspond to one or more tied triphone states of a keyword to be recognized in the speech signal, and wherein the one or more features comprise a frequency of an audio in the frame;
determining, by the one or more processors, a second likelihood score of the one or more features of the frame of the speech signal being associated with one or more states in a second model, wherein the one or more states in the second model correspond to one or more monophone states of the keyword to be recognized in the speech signal;
determining, by the one or more processors, a third likelihood score based on the first likelihood score and the second likelihood score, wherein the third likelihood score is deterministic of a likelihood of the frame corresponding to keywords other than the keyword; and
determining, by the one or more processors, a presence of the keyword in the speech signal based on the first likelihood score and the third likelihood score.
Dependent claims: 2, 3, 4, 5, 6, 7
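The sampling step and the frequency feature recited in claim 1 could be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes fixed-length overlapping frames and uses a zero-crossing count as a crude frequency estimate. The function names, the frame length, and the hop size are hypothetical choices that do not come from the claims.

```python
def split_into_frames(samples, frame_len, hop):
    """Split a sample sequence into fixed-length, possibly overlapping frames.

    A trailing partial frame is dropped.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def zero_crossing_frequency(frame, sample_rate):
    """Rough frequency estimate in Hz from the number of sign changes.

    A pure tone crosses zero twice per cycle, so the estimate is
    crossings / (2 * duration). This is an assumed, simplistic feature.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
    )
    duration = (len(frame) - 1) / sample_rate  # seconds spanned by the frame
    return crossings / (2.0 * duration)
```

For example, with 16 kHz audio, `split_into_frames(samples, 400, 160)` would yield 25 ms frames every 10 ms, and `zero_crossing_frequency(frame, 16000)` would give one frequency value per frame that a scoring model could consume.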
8. A system of keyword recognition in a speech signal, the system comprising:
one or more processors configured to:
sample the speech signal in one or more frames;
determine a first likelihood score of one or more features of a frame, of the one or more frames, of the speech signal being associated with one or more states in a first model, wherein the one or more states in the first model correspond to one or more tied triphone states of a keyword to be recognized in the speech signal, and wherein the one or more features comprise a frequency of an audio in the frame;
determine a second likelihood score of the one or more features of the frame of the speech signal being associated with one or more states in a second model, wherein the one or more states in the second model correspond to one or more monophone states of the keyword to be recognized in the speech signal;
determine a third likelihood score based on the first likelihood score and the second likelihood score, wherein the third likelihood score is deterministic of a likelihood of the frame corresponding to keywords other than the keyword; and
determine a presence of the keyword in the speech signal based on the first likelihood score and the third likelihood score.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer program product for use with a computer, the computer program product comprising a non-transitory computer readable medium, wherein the non-transitory computer readable medium stores a computer program code for keyword recognition in a speech signal, wherein the computer program code is executable by one or more processors to:
sample the speech signal in one or more frames;
determine a first likelihood score of one or more features of a frame, of the one or more frames, of the speech signal being associated with one or more states in a first model, wherein the one or more states in the first model correspond to one or more tied triphone states of a keyword to be recognized in the speech signal, and wherein the one or more features comprise a frequency of an audio in the frame;
determine a second likelihood score of the one or more features of the frame of the speech signal being associated with one or more states in a second model, wherein the one or more states in the second model correspond to one or more monophone states of the keyword to be recognized in the speech signal;
determine a third likelihood score based on the first likelihood score and the second likelihood score, wherein the third likelihood score is deterministic of a likelihood of the frame corresponding to keywords other than the keyword; and
determine a presence of the keyword in the speech signal based on the first likelihood score and the third likelihood score.
Specification