Energy-Efficient Unobtrusive Identification of a Speaker
Abstract
Functionality is described herein for recognizing speakers in an energy-efficient manner. The functionality employs a heterogeneous architecture that comprises at least a first processing unit and a second processing unit. The first processing unit handles a first set of audio processing tasks (associated with the detection of speech) while the second processing unit handles a second set of audio processing tasks (associated with the identification of speakers), where the first set of tasks consumes less power than the second set of tasks. The functionality also provides unobtrusive techniques for collecting audio segments for training purposes. The functionality also encompasses new applications which may be invoked in response to the recognition of speakers.
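The division of labor described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the thresholds, the zero-crossing speech check, the scalar "speaker model", and every function name below are assumptions chosen only to show how a cheap first stage can gate a costlier second stage.

```python
# Illustrative two-stage pipeline. All constants and helpers are hypothetical.
ENERGY_THRESHOLD = 0.01    # assumed audio strength test (mean squared amplitude)
ZCR_RANGE = (0.02, 0.35)   # assumed zero-crossing band, a crude stand-in for speech detection
ADMIT_THRESHOLD = 0.02     # assumed frame-quality admission test


def mean_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


def low_power_stage(frame):
    """First processing unit: cheap checks that run on every frame."""
    if mean_energy(frame) < ENERGY_THRESHOLD:
        return False  # fails the audio strength test, so stop here
    low, high = ZCR_RANGE
    return low <= zero_crossing_rate(frame) <= high  # crude speech check


def high_power_stage(frames, speaker_models):
    """Second processing unit: admit good frames, then pick the closest model.

    Each toy speaker model is a single number (an expected mean energy);
    a real system would use a much richer statistical model.
    """
    admitted = [f for f in frames if mean_energy(f) >= ADMIT_THRESHOLD]
    if not admitted:
        return None
    feature = sum(mean_energy(f) for f in admitted) / len(admitted)
    return min(speaker_models, key=lambda name: abs(speaker_models[name] - feature))


def identify(frames, speaker_models):
    """Invoke the second stage only when the first stage detects speech."""
    speech = [f for f in frames if low_power_stage(f)]
    if not speech:
        return None  # the higher-power unit is never invoked
    return high_power_stage(speech, speaker_models)
```

In this sketch, silent or weak audio is rejected by `low_power_stage` alone, so `high_power_stage` runs only on the small share of frames that appear to contain speech, which is the source of the energy saving the abstract describes.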
20 Claims
1. A method for identifying a speaker in an energy-efficient manner, comprising:
at a first processing unit:
determining whether a received audio signal satisfies an audio strength test; and
if the audio strength test is satisfied, determining whether the audio signal contains human speech; and
at a second processing unit:
identifying whether speech frames, obtained from the audio signal, have a quality which satisfies an admission test, the speech frames which satisfy the admission test comprising admitted frames; and
associating a speaker model with the admitted frames, the speaker model corresponding to a particular user,
operations performed by the first processing unit having a first power expenditure and operations performed by the second processing unit having a second power expenditure, the first power expenditure being less than the second power expenditure.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. A device for determining the identities of users, comprising:
a first processing unit including logic for determining whether an audio signal contains human speech; and
a second processing unit including logic for identifying, when invoked by the first processing unit, a speaker who is associated with the audio signal by comparing the audio signal with one or more speaker models,
operations performed by the first processing unit being performed on a more frequent basis than operations performed by the second processing unit, and the operations performed by the first processing unit having a first power expenditure and the operations performed by the second processing unit having a second power expenditure, the first power expenditure being less than the second power expenditure.
Dependent claims: 16
17. A method, implemented by computing functionality, for generating one or more speaker models, comprising:
obtaining an audio segment in an unobtrusive manner from a first participant, the audio segment capturing a conversation between the first participant and a second participant;
generating a speaker model for at least the second participant based on the audio segment; and
downloading the speaker model of said at least the second participant to a user device associated with the first participant, the user device including functionality for using the speaker model to identify the second participant using a heterogeneous energy-efficient processing architecture.
Dependent claims: 18, 19, 20
Specification