Method and apparatus for segmentation of audio interactions
First Claim
1. A speaker segmentation method for associating an at least one segment of speech for each of at least two sides of a summed audio interaction, with one of the at least two sides of the interaction, using additional information, the method comprising:
a receiving step for receiving the summed audio interaction from a capturing and logging unit;
a segmentation step for associating the at least one segment with one side of the summed audio interaction, the segmentation step comprising:
a parameterization step for transforming a speech signal into a set of feature vectors and dividing the set into non-overlapping segments;
an anchoring step for locating an anchor segment for each of the at least two sides of the summed audio interaction, the anchoring step comprising:
selecting a homogenous segment as a first anchor segment;
constructing a first model of the homogenous segment; and
selecting a second anchor segment such that its model is different from the first model; and
a modeling and classification step for associating at least one second segment with each side of the summed audio interaction; and
a scoring step for assigning a score to said segmentation.
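The parameterization and anchoring steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not say how homogeneity, the segment model, or model difference are measured, so the sketch assumes that feature vectors are already available as lists of floats, that the most homogeneous segment is the one with the lowest total variance, that a segment's model is simply its mean vector, and that "different" means the largest Euclidean distance between means. All function names are hypothetical.

```python
from statistics import fmean, pvariance


def split_segments(features, seg_len):
    """Divide a list of feature vectors into non-overlapping segments.

    A trailing remainder shorter than seg_len is dropped (an assumption).
    """
    return [features[i:i + seg_len]
            for i in range(0, len(features) - seg_len + 1, seg_len)]


def mean_model(segment):
    """Toy stand-in for a statistical model: the per-dimension mean."""
    return [fmean(dim) for dim in zip(*segment)]


def spread(segment):
    """Sum of per-dimension variances; lower is treated as more homogeneous."""
    return sum(pvariance(dim) for dim in zip(*segment))


def distance(a, b):
    """Euclidean distance between two mean vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def select_anchors(segments):
    """Return the indices of the first and second anchor segments."""
    # First anchor: the most homogeneous segment under the assumed measure.
    first = min(range(len(segments)), key=lambda i: spread(segments[i]))
    first_model = mean_model(segments[first])
    # Second anchor: the segment whose model differs most from the first model.
    second = max((i for i in range(len(segments)) if i != first),
                 key=lambda i: distance(mean_model(segments[i]), first_model))
    return first, second
```

In practice the feature vectors would come from a speech front end and the models would be richer than a mean vector; the sketch only shows the order of operations: segment, pick a homogeneous first anchor, then pick a second anchor whose model differs from the first.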
Abstract
A method and apparatus for segmenting an audio interaction by locating an anchor segment from each side of the interaction, iteratively classifying additional segments into one of the two sides, and scoring the resulting segmentation. If the score is below a threshold, the process is repeated until the segmentation score is satisfactory or until a stopping criterion is met. The anchoring and the scoring steps comprise using additional data associated with the interaction, a speaker thereof, internal or external information related to the interaction or to a speaker thereof, or the like.
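The repeat-until-satisfactory control flow described in the abstract can be sketched as follows. This is an illustrative sketch only: the anchoring, classification, and scoring functions are passed in as parameters, and the toy versions shown here (one-dimensional segment values, nearest-anchor classification, a separation-based score, a fixed list of candidate anchor pairs) are assumptions made for the example, not taken from the patent. The additional data that the abstract says informs anchoring and scoring is not modeled. The attempt limit stands in for the unspecified stopping criterion.

```python
from statistics import fmean


def segment_with_retries(segments, pick_anchors, classify, score,
                         threshold, max_attempts=5):
    """Anchor, classify, and score; repeat while the score is below threshold.

    Returns (best_score, best_labels) over all attempts made.
    """
    best = None
    for attempt in range(max_attempts):
        anchors = pick_anchors(segments, attempt)
        labels = classify(segments, anchors)
        current = score(segments, labels)
        if best is None or current > best[0]:
            best = (current, labels)
        if current >= threshold:
            break  # segmentation score is satisfactory
    return best


# Toy, hypothetical helpers: each segment is reduced to a single float.
SEGMENTS = [1.0, 1.1, 5.0, 0.9, 5.2]


def toy_pick_anchors(segments, attempt):
    """Try a different candidate anchor pair on each attempt."""
    candidates = [(0, 1), (0, 2)]
    return candidates[attempt % len(candidates)]


def toy_classify(segments, anchors):
    """Label each segment 0 or 1 according to the nearer anchor value."""
    a, b = segments[anchors[0]], segments[anchors[1]]
    return [0 if abs(x - a) <= abs(x - b) else 1 for x in segments]


def toy_score(segments, labels):
    """Gap between the two side means, normalized by the overall range."""
    side0 = [x for x, lab in zip(segments, labels) if lab == 0]
    side1 = [x for x, lab in zip(segments, labels) if lab == 1]
    if not side0 or not side1:
        return 0.0
    span = max(segments) - min(segments)
    return abs(fmean(side0) - fmean(side1)) / span if span else 0.0
```

With these toy helpers and a threshold of 0.9, the first attempt picks two anchors from the same side and scores below the threshold, so the loop retries with a different anchor pair and stops once the two sides are well separated.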
20 Claims
1. Independent method claim, recited in full under First Claim above. Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17.
18. A speaker segmentation apparatus for associating an at least one segment of speech for each of at least two speakers participating in an audio interaction, with a side of the interaction, using additional information, the apparatus comprising:
a segmentation component for associating an at least one segment within the audio interaction with one side of the audio interaction, the segmentation component comprising:
a parameterization component for transforming a speech signal into a set of feature vectors and dividing the set into non-overlapping segments;
an anchoring component for locating an anchor segment for each of the at least two sides of the audio interaction, the anchoring component selecting a homogenous segment as a first anchor segment, and a second anchor segment having a statistical model different from a statistical model associated with the first anchor segment; and
a modeling and classification component for associating at least one second segment with each side of the audio interaction; and
a scoring component for assigning a score to said segmentation.
Dependent claims: 19.
20. A quality management apparatus for interaction-rich speech environments, the apparatus comprising:
a capturing or logging component for capturing or logging an at least one audio interaction in which at least two sides communicate;
a segmentation component for segmenting the at least one audio interaction, the segmentation component comprising:
a parameterization component for transforming a speech signal into a set of feature vectors and dividing the set into non-overlapping segments;
an anchoring component for locating an anchor segment for each of the at least two sides of the at least one audio interaction, the anchoring component selecting a homogenous segment as a first anchor segment, and a second anchor segment having a statistical model different from a statistical model associated with the first anchor segment; and
a modeling and classification component for associating at least one second segment with each side of the at least one audio interaction; and
a playback component for playing an at least one part of the at least one audio interaction.
Specification