LEARNING SPEECH MODELS FOR MOBILE DEVICE USERS
First Claim
1. A method for training a user speech model, the method comprising:
accessing audio data captured while a mobile device is in an in-call state;
clustering the captured audio data into a plurality of clusters, each cluster of the plurality of clusters being associated with one or more audio segments from the accessed audio data;
identifying a predominate voice cluster; and
training the user speech model based, at least in part, on audio data associated with the predominate voice cluster.
Abstract
Techniques are provided to recognize a speaker's voice. In one embodiment, received audio data may be separated into a plurality of signals. Each signal may be associated with one or more values for one or more features (e.g., Mel-frequency cepstral coefficients). The received data may be clustered (e.g., by clustering features associated with the signals). A predominate voice cluster may be identified and associated with a user. A speech model (e.g., a Gaussian Mixture Model or Hidden Markov Model) may be trained based on data associated with the predominate cluster. A received audio signal may then be processed using the speech model to, e.g.: determine who was speaking; determine whether the user was speaking; determine whether anyone was speaking; and/or determine what words were said. A context of the device or the user may then be inferred based at least partly on the processed signal.
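The flow summarized in the abstract (cluster the features, pick the predominate cluster, train a model on it, then score new audio) can be illustrated with a minimal Python sketch. This is not the patented implementation, only one possible reading under stated assumptions: the 2-D feature vectors are made-up stand-ins for MFCCs, plain k-means stands in for whichever clustering technique is actually used, "predominate" is assumed to mean the cluster holding the most frames, and a single diagonal Gaussian stands in for a Gaussian Mixture Model. All function names are hypothetical.

```python
import math

def dist2(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-first initialization.
    Returns one cluster label per point."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        far = max(points, key=lambda p: min(dist2(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        # Move each center to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def train_gaussian(frames, var_floor=1e-3):
    """Fit a diagonal Gaussian (assumption: simplified stand-in for a GMM)."""
    n = len(frames)
    cols = list(zip(*frames))
    mean = [sum(col) / n for col in cols]
    var = [max(sum((x - m) ** 2 for x in col) / n, var_floor)
           for col, m in zip(cols, mean)]
    return mean, var

def log_likelihood(model, frame):
    """Log-density of one frame under the diagonal Gaussian."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
               for x, m, v in zip(frame, mean, var))

def train_user_model(frames, k=2):
    """Cluster frames, pick the largest cluster, train the model on it."""
    labels = kmeans(frames, k)
    counts = [labels.count(c) for c in range(k)]
    predominant = counts.index(max(counts))  # assumption: most frames wins
    user_frames = [f for f, l in zip(frames, labels) if l == predominant]
    return train_gaussian(user_frames), labels, predominant

# Made-up "in-call" frames: six from the device user, three from another voice.
user = [[1.0 + 0.1 * i, 1.0 - 0.05 * i] for i in range(6)]
other = [[5.0 + 0.1 * i, 5.0] for i in range(3)]
model, labels, predominant = train_user_model(user + other)

# A new frame is scored against the trained model; a higher value means
# the frame looks more like the predominate (assumed user) voice.
score_user_like = log_likelihood(model, [1.2, 0.9])
score_other_like = log_likelihood(model, [5.0, 5.0])
```

In practice a decision such as "the user was speaking" would compare a score like `score_user_like` against a threshold or a competing background model; the choice of threshold is outside what this section specifies.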
30 Claims
1. A method for training a user speech model, the method comprising:
accessing audio data captured while a mobile device is in an in-call state;
clustering the captured audio data into a plurality of clusters, each cluster of the plurality of clusters being associated with one or more audio segments from the accessed audio data;
identifying a predominate voice cluster; and
training the user speech model based, at least in part, on audio data associated with the predominate voice cluster.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
18. An apparatus for training a user speech model, the apparatus comprising:
a mobile device comprising:
a microphone configured to, upon being in an active state, receive audio signals and convert the received audio signals into radio signals; and
a transmitter configured to transmit the radio signals; and
one or more processors configured to:
determine that the microphone is in the active state;
capture audio data while the microphone is in the active state;
cluster the captured audio data into a plurality of clusters, each cluster of the plurality of clusters being associated with one or more audio segments from the captured audio data;
identify a predominate voice cluster; and
train the user speech model based, at least in part, on audio data associated with the predominate voice cluster.
Dependent claims: 19, 20, 21, 22
23. A computer-readable medium containing a program which executes the steps of:
accessing audio data captured while a mobile device is in an in-call state;
clustering the accessed audio data into a plurality of clusters, each cluster of the plurality of clusters being associated with one or more audio segments from the accessed audio data;
identifying a predominate voice cluster; and
training the user speech model based, at least in part, on audio data associated with the predominate voice cluster.
Dependent claims: 24, 25, 26
27. A system for training a user speech model, the system comprising:
means for accessing audio data captured while a mobile device is in an in-call state;
means for clustering the accessed audio data into a plurality of clusters, each cluster of the plurality of clusters being associated with one or more audio segments from the accessed audio data;
means for identifying a predominate voice cluster; and
means for training the user speech model based, at least in part, on audio data associated with the predominate voice cluster.
Dependent claims: 28, 29, 30
Specification