Real-time audio recording system for automatic speaker indexing

US 5,606,643 A
Filed: 04/12/1994
Issued: 02/25/1997
Est. Priority Date: 04/12/1994
Status: Expired due to Term

Abstract

A processor controlled system for correlating an electronic index according to speaker for audio data being recorded in real time. The system includes a source of training data for each of the plurality of individual speakers and audio input system for providing real time audio data including speech for the individual speakers. The audio data is converted into spectral feature data by an audio processor, and is simultaneously recorded on a storage medium by a recording device. A system processor accepts the training data to create individual speaker models, which are combined in parallel to form a speaker network. The system processor then accepts the spectral feature data of the audio data and, using the speaker network, determines segments in the audio data corresponding to each speaker.
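
The index described in the abstract can be pictured as a list of (timestamp, speaker identifier) pairs, one per segment. The following is a minimal sketch, not the patented implementation: it assumes the segmentation step has already assigned a speaker label to every fixed-length analysis frame. The names `IndexEntry`, `build_index`, and `frame_seconds` are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class IndexEntry:
    """One index record: where a segment starts and who is speaking."""
    timestamp: float  # seconds from the start of the recording (assumed unit)
    speaker_id: str   # identifier associated with the speaker model


def build_index(frame_labels: Sequence[str],
                frame_seconds: float = 0.01) -> List[IndexEntry]:
    """Emit one entry at the start of each run of identical frame labels.

    frame_labels: per-frame speaker identifiers (assumed output of the
                  segmentation step).
    frame_seconds: duration of one analysis frame (assumed value).
    """
    index: List[IndexEntry] = []
    previous = None
    for i, label in enumerate(frame_labels):
        if label != previous:
            # A new segment begins here, so record its start time.
            index.append(IndexEntry(timestamp=i * frame_seconds,
                                    speaker_id=label))
            previous = label
    return index
```

Looking up every entry with a given `speaker_id` then yields the start times at which that speaker begins talking on the recording.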

8 Claims

1. A processor controlled system for electronically indexing audio data being recorded in real time, said audio data comprising speech from a plurality of individual speakers, the processor controlled system electronically indexing the audio data according to speaker, the system comprising:
- a training data source for providing spectral feature training data for each of the plurality of individual speakers;
  
  a system processor for receiving the training data and producing a speaker model for each of the plurality of individual speakers, each speaker model having an associated speaker identifier, said system processor further combining said speaker models into a speaker network;
  
  an audio input system for providing real time audio data comprising speech from the plurality of individual speakers;
  
  an audio processor for receiving said audio data and converting said audio data into spectral feature data;
  
  a recording device for receiving said audio data and recording said audio data on a storage medium according to a received time;
  
  memory for storing data, the data stored in the memory including instruction data indicating instructions the processor executes;
  
  said system processor further accessing the data stored in the memory;
  
  said system processor, in executing the instructions, receiving said spectral feature data from said audio processor and, using said speaker network, determining segments of said audio data which correspond to different individual speaker models;
  
  said system processor further determining at the start of each segment a timestamp, said timestamp corresponding to the received time for that segment on said storage medium, said system processor storing said timestamp in said memory;
  
  said system processor further storing said speaker identifier of said individual speaker model for each segment in said memory in conjunction with said storage medium location address for that segment.
  - 2. The system of claim 1, said system processor producing for each of the plurality of individual speakers an individual Hidden Markov Model (HMM) speaker model.
  - 3. The system of claim 2, said system processor further combining said individual Hidden Markov Model (HMM) speaker models in parallel to construct a speaker network HMM.
  - 4. The system of claim 3, wherein said system processor, in determining segments of said audio data which correspond to different speaker models, determines an optimal path through the speaker network HMM, identifying segments of said audio data associated with different individual HMM speaker models.
  - 5. The system of claim 1, wherein said system processor further collects a set of segments associated with each speaker identifier.
  - 6. The system of claim 5, wherein said recording device is further connected for providing said recorded audio data to said audio processor to be converted into new spectral feature data;
    - said system processor using collected segments associated with each speaker identifier to produce new speaker models for each of the plurality of individual speakers, said system processor further combining said new speaker models into a new speaker network;
      
      said system processor further receiving said new spectral feature data and, using said new speaker network, determining segments of said audio data which correspond to different new individual speaker models;
      
      said system processor further determining at the start of each segment a timestamp, said timestamp corresponding to the received time for that segment on said storage medium, said system processor storing said timestamp in said memory;
      
      said system processor further storing said speaker identifier of said new individual speaker model for each segment in said memory in conjunction with said timestamp for that segment.
  - 7. The system of claim 1, wherein said system processor, in receiving said training data, further produces a garbage model from portions of each training data for each of the plurality of individual speakers, said garbage model further being combined into said speaker network.
  - 8. The system of claim 1, wherein said system processor further produces a silence model, said silence model further being combined into said speaker network.
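
Claims 2 through 4 and claim 8 describe combining per-speaker HMMs (and a silence model) in parallel and finding an optimal path through the resulting network. The sketch below illustrates that idea with a Viterbi search under strong simplifying assumptions that are not taken from the patent: each model is reduced to a single state with a one-dimensional Gaussian emission, staying in a model is free, and moving between models costs a fixed `switch_penalty`. The function names, the penalty value, and the toy feature values are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def log_gauss(x: float, mean: float, var: float) -> float:
    """Log density of a 1-D Gaussian (var must be positive)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi_segment(features: Sequence[float],
                    models: Dict[str, Tuple[float, float]],
                    switch_penalty: float = 5.0) -> List[str]:
    """Return the best-scoring model label for every frame.

    features: one scalar feature per frame (a stand-in for spectral features).
    models:   label -> (mean, variance); one state per model (assumption).
    switch_penalty: log-domain cost of moving between models (assumption).
    """
    if not features:
        return []
    names = list(models)
    # score[s]: best log score of any path that ends in model s at this frame.
    score = {s: log_gauss(features[0], *models[s]) for s in names}
    back: List[Dict[str, str]] = []
    for x in features[1:]:
        new_score: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for s in names:
            def arrive(p: str, target: str = s) -> float:
                # Score of reaching `target` from predecessor p.
                return score[p] - (0.0 if p == target else switch_penalty)
            best_prev = max(names, key=arrive)
            pointers[s] = best_prev
            new_score[s] = arrive(best_prev) + log_gauss(x, *models[s])
        score = new_score
        back.append(pointers)
    # Trace the optimal path backwards from the best final model.
    state = max(names, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Runs of identical labels in the returned path correspond to the segments of the claims; the first frame of each run is where a timestamp would be taken. A full system would use multi-state HMMs over spectral feature vectors rather than the scalar toy models assumed here.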

Current Assignee: Xerox Corporation (Xerox Holdings Corp.)
Original Assignee: Xerox Corporation (Xerox Holdings Corp.)
Inventors: Kimber, Donald G.; Balasubramanian, Vijay; Poon, Alex D.; Weber, Karon A.; Chou, Philip A.; Chen, Francine R.; Wilcox, Lynn D.
Primary Examiner(s): MacDonald, Allen R.
Assistant Examiner(s): Chawan, Vijay B.
Application Number: US08/226,580
Time in Patent Office: 1,050 Days
Field of Search: 381/41-43, 381/82, 395/2, 395/2.52
US Class Current: 704/243
CPC Class Codes:
G10L 17/00   Speaker identification or v...
G11B 27/28   by using information signal...
G11B 27/3036   Time code signal
