Intermediate scoring and rejection loopback for improved key phrase detection
Abstract
Techniques related to key phrase detection for applications such as wake on voice are discussed. Such techniques may include intermediate scoring of a state or states of a key phrase model and/or a backward transition or rejection loopback from a state of the key phrase model to a rejection model to reduce false accepts based on received utterances.
56 Citations
24 Claims
1. A computer-implemented method for key phrase detection comprising:
- updating, at a current time instance, a start state based rejection model and a key phrase model associated with a predetermined key phrase based on scores of sub-phonetic units representative of received audio input, wherein the start state based rejection model includes a single rejection state having a plurality of rejection model self loops each associated with a particular score of the scores of sub-phonetic units, wherein the key phrase model includes a plurality of key phrase states interconnected by transitions therebetween, wherein the start state based rejection model and the key phrase model are connected by a first transition from the single rejection state to a first key phrase state of the plurality of key phrase states, and wherein said updating comprises:
  - transitioning a score from a particular key phrase state of the plurality of key phrase states of the key phrase model to a next key phrase state of the plurality of key phrase states of the key phrase model;
  - transitioning the score from the particular key phrase state to the single rejection state of the start state based rejection model; and
  - generating a rejection likelihood score corresponding to the single rejection state of the start state based rejection model and a key phrase likelihood score corresponding to the key phrase model; and
- detecting the predetermined key phrase in the received audio input based on the rejection likelihood score and the key phrase likelihood score; and
- providing a wake indicator or a command in response to the detected predetermined key phrase.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
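The update recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the log-domain scores, the use of `max` to combine incoming transitions, the self loops on key phrase states, the score-difference threshold test, and all names (`update_models`, `loopback_states`, `detect`) are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

NEG_INF = float("-inf")  # log-domain score of a state that has not been reached


def update_models(
    rejection_score: float,
    key_scores: Sequence[float],
    rejection_loop_scores: Sequence[float],
    key_state_unit_scores: Sequence[float],
    loopback_states: Sequence[int] = (),
) -> Tuple[float, List[float]]:
    """Advance the single rejection state and the key phrase states by one time instance.

    All inputs are log-domain scores (an assumption of this sketch).
    """
    # Rejection state: its own previous score competes with scores transitioned
    # back from the chosen key phrase states (the rejection loopback).
    incoming = [rejection_score] + [key_scores[i] for i in loopback_states]
    # Each self loop carries one sub-phonetic unit score; take the best of them.
    new_rejection = max(incoming) + max(rejection_loop_scores)

    new_key: List[float] = []
    for i, unit_score in enumerate(key_state_unit_scores):
        # The first key phrase state is fed by the rejection state (first transition);
        # every later state is fed by the previous key phrase state.
        predecessor = rejection_score if i == 0 else key_scores[i - 1]
        # Assumed self loop: a state may also keep its own previous score.
        new_key.append(max(key_scores[i], predecessor) + unit_score)
    return new_rejection, new_key


def detect(rejection_score: float, key_phrase_score: float, threshold: float) -> bool:
    """Assumed decision rule: log-likelihood difference compared with a threshold."""
    return (key_phrase_score - rejection_score) > threshold


# One update step with a loopback from key phrase state 1 to the rejection state.
rej, keys = update_models(
    rejection_score=-5.0,
    key_scores=[-1.0, -2.0, -3.0],
    rejection_loop_scores=[-1.0, -0.5],
    key_state_unit_scores=[-0.2, -0.3, -0.1],
    loopback_states=(1,),
)
detected = detect(rej, keys[-1], threshold=0.0)
```

In this sketch the loopback lets a strong partial match raise the rejection score, so an utterance that matches only part of the key phrase has to beat a higher rejection score to be accepted.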
10. A computer-implemented method for key phrase detection comprising:
- updating a start state based rejection model and a key phrase model associated with a predetermined key phrase based on scores of sub-phonetic units representative of received audio input, wherein the start state based rejection model includes a single rejection state having a plurality of rejection model self loops each associated with a particular score of the scores of sub-phonetic units and wherein the key phrase model includes a plurality of key phrase states interconnected by transitions therebetween, the start state based rejection model and the key phrase model being connected by a first transition from the single rejection state to a first key phrase state of the plurality of key phrase states;
- determining a rejection likelihood score based on the single rejection state of the updated start state based rejection model;
- determining an overall key phrase likelihood score comprising a minimum of only a subset of likelihood scores associated with a corresponding subset of key phrase states of the key phrase model including at least a first likelihood score associated with a first key phrase state corresponding to an end of a first portion of the key phrase and a second likelihood score associated with a final key phrase state of the key phrase model corresponding to an end of a second portion of the key phrase; and
- detecting the predetermined key phrase in the received audio input based on the rejection likelihood score and the key phrase likelihood score; and
- providing a wake indicator or a command in response to the detected predetermined key phrase.
Dependent claims: 11, 12, 13, 14, 15
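The intermediate scoring recited in claim 10 can be sketched as a minimum over a chosen subset of key phrase state scores. This is only an illustration under stated assumptions: the function names, the state indices, the example scores, and the score-difference threshold test are hypothetical, and the sketch does not say how the per-state scores were obtained.

```python
from typing import Sequence


def overall_key_phrase_score(
    state_scores: Sequence[float],
    checkpoint_states: Sequence[int],
) -> float:
    """Return the minimum score over only the selected key phrase states.

    `checkpoint_states` is assumed to hold the index of the state ending the
    first portion of the key phrase and the index of the final state.
    """
    if not checkpoint_states:
        raise ValueError("at least one checkpoint state is required")
    return min(state_scores[i] for i in checkpoint_states)


def detect_with_checkpoints(
    rejection_score: float,
    state_scores: Sequence[float],
    checkpoint_states: Sequence[int],
    threshold: float,
) -> bool:
    """Assumed decision rule: overall score minus rejection score above a threshold."""
    overall = overall_key_phrase_score(state_scores, checkpoint_states)
    return (overall - rejection_score) > threshold


# Hypothetical eight-state key phrase model: state 3 ends the first portion,
# state 7 is the final state. The first portion matched poorly (-4.0).
scores = [-1.0, -0.5, -0.8, -4.0, -0.7, -0.6, -0.9, -1.2]
overall = overall_key_phrase_score(scores, checkpoint_states=(3, 7))
final_only = detect_with_checkpoints(-2.0, scores, (7,), threshold=0.0)
with_intermediate = detect_with_checkpoints(-2.0, scores, (3, 7), threshold=0.0)
```

With only the final state checked, the example utterance would be accepted; including the intermediate state pulls the overall score down to the weak first portion, and the utterance is rejected.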
16. A system for performing key phrase detection comprising:
- a memory configured to store a start state based rejection model and a key phrase model associated with a predetermined key phrase;
- a digital signal processor coupled to the memory, the digital signal processor to update, at a current time instance, the start state based rejection model and the key phrase model based on scores of sub-phonetic units representative of received audio input, wherein the start state based rejection model includes a single rejection state having a plurality of rejection model self loops each associated with a particular score of the scores of sub-phonetic units, wherein the key phrase model includes a plurality of key phrase states interconnected by transitions therebetween, wherein the start state based rejection model and the key phrase model are connected by a first transition from the single rejection state to a first key phrase state of the plurality of key phrase states, and wherein to update the start state based rejection model and the key phrase model, the digital signal processor is to transition a score from a particular key phrase state of the plurality of key phrase states of the key phrase model to a next key phrase state of the plurality of key phrase states of the key phrase model, to transition the score from the particular key phrase state to the single rejection state of the start state based rejection model, and to generate a rejection likelihood score corresponding to the single rejection state of the start state based rejection model and a key phrase likelihood score corresponding to the key phrase model; and
  - detect the predetermined key phrase in the received audio input based on the rejection likelihood score and the key phrase likelihood score; and
  - provide a wake indicator or a command in response to the detected predetermined key phrase.
Dependent claims: 17, 18, 19, 20
21. A system for performing key phrase detection comprising:
- a memory configured to store a start state based rejection model and a key phrase model associated with a predetermined key phrase; and
- a digital signal processor coupled to the memory, the digital signal processor to update the start state based rejection model and the key phrase model based on scores of sub-phonetic units representative of received audio input, wherein the start state based rejection model includes a single rejection state having a plurality of rejection model self loops each associated with a particular score of the scores of sub-phonetic units and wherein the key phrase model includes a plurality of key phrase states interconnected by transitions therebetween, the start state based rejection model and the key phrase model being connected by a first transition from the single rejection state to a first key phrase state of the plurality of key phrase states, to determine a rejection likelihood score based on the single rejection state of the updated start state based rejection model, to determine an overall key phrase likelihood score comprising a minimum of only a subset of likelihood scores associated with a corresponding subset of key phrase states of the key phrase model including at least a first likelihood score associated with a first key phrase state corresponding to an end of a first portion of the key phrase and a second likelihood score associated with a final key phrase state of the key phrase model corresponding to an end of a second portion of the key phrase;
  - detect the predetermined key phrase in the received audio input based on the rejection likelihood score and the key phrase likelihood score; and
  - provide a wake indicator or a command in response to the detected predetermined key phrase.
Dependent claims: 22, 23, 24
Specification