Real-time audio recording system for automatic speaker indexing
First Claim
1. A processor controlled system for electronically indexing audio data being recorded in real time, said audio data comprising speech from a plurality of individual speakers, the processor controlled system electronically indexing the audio data according to speaker, the system comprising:
a training data source for providing spectral feature training data for each of the plurality of individual speakers;
a system processor for receiving the training data and producing a speaker model for each of the plurality of individual speakers, each speaker model having an associated speaker identifier, said system processor further combining said speaker models into a speaker network;
an audio input system for providing real time audio data comprising speech from the plurality of individual speakers;
an audio processor for receiving said audio data and converting said audio data into spectral feature data;
a recording device for receiving said audio data and recording said audio data on a storage medium according to a received time;
memory for storing data, the data stored in the memory including instruction data indicating instructions the processor executes;
said system processor further accessing the data stored in the memory;
said system processor, in executing the instructions, receiving said spectral feature data from said audio processor and, using said speaker network, determining segments of said audio data which correspond to different individual speaker models;
said system processor further determining at the start of each segment a timestamp, said timestamp corresponding to the received time for that segment on said storage medium, said system processor storing said timestamp in said memory;
said system processor further storing said speaker identifier of said individual speaker model for each segment in said memory in conjunction with said storage medium location address for that segment.
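The last two claim elements describe an index that pairs the start time of each segment with a speaker identifier. A minimal sketch of such an index, assuming the segmentation step has already produced one speaker label per fixed-length frame: the names `IndexEntry`, `build_index`, and `frame_period` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class IndexEntry:
    timestamp: float  # seconds from the start of the recording to the segment start
    speaker_id: str   # identifier of the speaker model matched to the segment


def build_index(frame_labels: Sequence[str], frame_period: float) -> List[IndexEntry]:
    """Collapse per-frame speaker labels into one entry per segment.

    Assumption: frames are evenly spaced, so frame i begins at i * frame_period.
    """
    index: List[IndexEntry] = []
    previous: Optional[str] = None
    for i, label in enumerate(frame_labels):
        if label != previous:
            # A change of label marks the start of a new segment.
            index.append(IndexEntry(i * frame_period, label))
            previous = label
    return index
```

Looking up every entry with a given `speaker_id` then yields the positions on the recording at which that speaker begins talking.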
Abstract
A processor controlled system for creating an electronic index, organized by speaker, for audio data being recorded in real time. The system includes a source of training data for each of a plurality of individual speakers and an audio input system for providing real time audio data including speech from the individual speakers. The audio data is converted into spectral feature data by an audio processor and is simultaneously recorded on a storage medium by a recording device. A system processor accepts the training data to create individual speaker models, which are combined in parallel to form a speaker network. The system processor then accepts the spectral feature data of the audio data and, using the speaker network, determines segments in the audio data corresponding to each speaker.
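The flow described in the abstract (train one model per speaker, combine the models in parallel, then label incoming feature frames) can be sketched as follows. This is an illustration under stated assumptions, not the patented method: each speaker model is assumed to be a single diagonal Gaussian over the feature vectors, and the parallel network is assumed to be a Viterbi search with a fixed `switch_penalty` for changing speakers. All function names and the penalty value are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Model = Tuple[List[float], List[float]]  # (per-dimension mean, per-dimension variance)


def train_model(frames: Sequence[Vector]) -> Model:
    """Fit a diagonal Gaussian to one speaker's training frames (assumed model type)."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Floor the variance so a constant feature does not cause division by zero.
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6) for d in range(dim)]
    return mean, var


def log_likelihood(model: Model, frame: Vector) -> float:
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
               for x, m, v in zip(frame, mean, var))


def segment(models: Dict[str, Model], frames: Sequence[Vector],
            switch_penalty: float = 5.0) -> List[str]:
    """Return the best speaker identifier per frame using Viterbi decoding."""
    if not frames:
        return []
    ids = list(models)
    score = {s: log_likelihood(models[s], frames[0]) for s in ids}
    back: List[Dict[str, str]] = []
    for frame in frames[1:]:
        best_prev = max(ids, key=lambda s: score[s])
        new_score, pointers = {}, {}
        for s in ids:
            stay = score[s]                          # remain with the same speaker
            switch = score[best_prev] - switch_penalty  # arrive from the best other path
            pointers[s] = s if stay >= switch else best_prev
            new_score[s] = max(stay, switch) + log_likelihood(models[s], frame)
        back.append(pointers)
        score = new_score
    # Trace the best path backwards from the highest-scoring final speaker.
    state = max(ids, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

The switch penalty discourages rapid alternation between speakers on isolated frames. A real system would likely use richer speaker models and tune the penalty on held-out data.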
8 Claims
Specification