Segmentation of audio data for indexing of conversational speech for real-time or postprocessing applications
First Claim
1. A method for segmenting audio data according to speaker, said audio data comprising conversational speech from a plurality of individual speakers, comprising the steps of:
providing an individual HMM for each individual speaker of the plurality of individual speakers of the audio data, each Hidden Markov Model (HMM) having at least one state;
constructing a speaker network HMM by connecting said individual HMMs in parallel;
segmenting said audio data into segments by determining a most likely sequence of states through the speaker network HMM, each segment being associated with a one of said individual HMMs; and
determining an individual speaker of the plurality of individual speakers of each segment of the path.
Abstract
A method is provided for segmenting audio data according to speaker, the audio data comprising speech from a plurality of individual speakers. The method comprises providing an individual HMM for each individual speaker, each individual HMM including at least one state, and constructing a speaker network HMM by connecting the individual HMMs in parallel. The audio data is then divided into segments by determining a most likely sequence of states through the speaker network HMM, each of the segments being associated with one of the individual HMMs. The speaker of each segment is then identified. The segmented data may be used to form an index into the audio data according to speaker.
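The segmentation step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes each speaker is modeled by a single-state HMM with a one-dimensional Gaussian emission, that the parallel network allows a switch between speakers with a fixed total probability `p_switch`, and that the most likely state sequence is found with the standard Viterbi algorithm. The names `viterbi_segment`, `gaussian_logpdf`, and `p_switch` are hypothetical and chosen only for this illustration.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian (assumed emission model)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi_segment(frames, speakers, p_switch=0.01):
    """Segment frames by speaker with Viterbi decoding.

    frames:   list of scalar features, one per audio frame (assumption).
    speakers: dict mapping speaker name -> (mean, variance) of a
              single-state HMM (assumption).
    Returns a list of (start_frame, end_frame_exclusive, speaker_name).
    """
    if not frames:
        return []
    names = list(speakers)
    n = len(names)
    log_stay = math.log(1.0 - p_switch)
    # Switching mass is shared equally among the other speakers.
    log_move = math.log(p_switch / (n - 1)) if n > 1 else float("-inf")

    # Uniform start probability over the parallel speaker models.
    score = [-math.log(n) + gaussian_logpdf(frames[0], *speakers[s])
             for s in names]
    backptrs = []
    for x in frames[1:]:
        new_score, ptr = [], []
        for j, s in enumerate(names):
            # Best predecessor state for speaker j at this frame.
            cands = [score[i] + (log_stay if i == j else log_move)
                     for i in range(n)]
            best_i = max(range(n), key=lambda i: cands[i])
            new_score.append(cands[best_i] + gaussian_logpdf(x, *speakers[s]))
            ptr.append(best_i)
        score = new_score
        backptrs.append(ptr)

    # Trace back the most likely state sequence.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()

    # Merge runs of identical states into speaker segments.
    segments, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segments.append((start, t, names[path[start]]))
            start = t
    return segments
```

Calling `viterbi_segment(frames, {"A": (0.0, 1.0), "B": (5.0, 1.0)})` on a list of scalar frames returns contiguous runs labeled with the speaker whose model best explains them. A small `p_switch` discourages rapid speaker changes, which is one common way to smooth the decoded path; the source does not specify how transitions between the individual HMMs are weighted.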
20 Claims
1. A method for segmenting audio data according to speaker, said audio data comprising conversational speech from a plurality of individual speakers, comprising the steps of:
providing an individual HMM for each individual speaker of the plurality of individual speakers of the audio data, each Hidden Markov Model (HMM) having at least one state;
constructing a speaker network HMM by connecting said individual HMMs in parallel;
segmenting said audio data into segments by determining a most likely sequence of states through the speaker network HMM, each segment being associated with a one of said individual HMMs; and
determining an individual speaker of the plurality of individual speakers of each segment of the path.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method of indexing audio data according to speaker, said audio data comprising conversational speech from a plurality of individual speakers, comprising the steps of:
providing an individual Hidden Markov Model (HMM) for each individual speaker of the plurality of individual speakers of the audio data, each individual HMM including at least one state;
constructing a speaker network HMM by connecting said individual HMMs in parallel;
segmenting said audio data into segments by finding a most likely sequence of states through the speaker network HMM, each segment being associated with a one of the individual HMMs;
determining for each segment an individual speaker of the plurality of individual speakers according to said individual HMMs;
collecting segments from each individual; and
outputting the results of the collected segments.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
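The last two steps of claim 11 (collecting segments from each individual and outputting the results) can be sketched as follows. This is only an illustration under stated assumptions: segments are taken to be `(start_frame, end_frame, speaker)` tuples, and each frame is assumed to last `frame_seconds` (0.02 s here, a hypothetical value). The function names `build_speaker_index` and `format_index` are not from the source.

```python
from collections import defaultdict


def build_speaker_index(segments, frame_seconds=0.02):
    """Collect segments per speaker.

    segments: iterable of (start_frame, end_frame_exclusive, speaker).
    frame_seconds: assumed duration of one frame, in seconds.
    Returns a dict mapping speaker -> list of (start_s, end_s) spans,
    in the order the segments were given.
    """
    index = defaultdict(list)
    for start, end, speaker in segments:
        index[speaker].append((start * frame_seconds, end * frame_seconds))
    return dict(index)


def format_index(index):
    """Render the per-speaker index as plain text, one speaker per line."""
    lines = []
    for speaker in sorted(index):
        spans = ", ".join(f"{s:.2f}-{e:.2f}s" for s, e in index[speaker])
        lines.append(f"{speaker}: {spans}")
    return "\n".join(lines)
```

Passing a list of labeled segments to `build_speaker_index` and then to `format_index` yields one line per speaker listing the time spans attributed to that speaker, which is one simple form an index into the audio data could take.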
Specification