System and method for the automatic discovery of salient segments in speech transcripts

US 6,928,407 B2
Filed: 03/29/2002
Issued: 08/09/2005
Est. Priority Date: 03/29/2002
Status: Active Grant


3 Assignments

0 Petitions

Abstract

A system and associated method automatically discover salient segments in a speech transcript and focus on the segmentation of an audio/video source into topically cohesive segments based on Automatic Speech Recognition (ASR) transcriptions. The word n-grams are extracted from the speech transcript using a three-phase segmentation algorithm based on the following sequence or combination of boundary-based and content-based methods: a boundary-based method; a rate of arrival of feature method; and a content-based method. In the first two segmentation passes, the temporal proximity and the rate of arrival of features are analyzed to compute an initial segmentation. In the third segmentation pass, changes in the set of content-bearing words used by adjacent segments are detected, to validate the initial segments for merging them, to prevent over-segmentation.
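The three passes described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the transcript format (a list of `(time_in_seconds, word)` pairs), the stop-word list, the use of non-stop words as features, the gap-based rules in passes 1 and 2, the Jaccard overlap test in pass 3, and all function names and thresholds are assumptions chosen only to show how an initial segmentation can be over-split and then repaired by a content-based merge.

```python
from statistics import mean

# Hypothetical stop-word list; the patent does not specify one.
STOP_WORDS = {"the", "a", "of", "and", "to", "in", "is", "we", "now"}


def boundary_pass(features, max_gap):
    """Pass 1 (assumed rule): cut where consecutive features are more than
    max_gap seconds apart, i.e. where temporal proximity breaks down."""
    if not features:
        return []
    segments, current = [], [features[0]]
    for prev, cur in zip(features, features[1:]):
        if cur[0] - prev[0] > max_gap:
            segments.append(current)
            current = []
        current.append(cur)
    segments.append(current)
    return segments


def rate_pass(segments, factor):
    """Pass 2 (assumed rule): inside each first segment, cut where a gap
    exceeds factor times that segment's mean inter-arrival time."""
    out = []
    for seg in segments:
        gaps = [b[0] - a[0] for a, b in zip(seg, seg[1:])]
        if not gaps:
            out.append(seg)
            continue
        limit = factor * mean(gaps)
        current = [seg[0]]
        for gap, item in zip(gaps, seg[1:]):
            if gap > limit:
                out.append(current)
                current = []
            current.append(item)
        out.append(current)
    return out


def content_pass(segments, min_overlap):
    """Pass 3 (assumed rule): merge adjacent segments whose word sets have a
    Jaccard overlap of at least min_overlap, to undo over-segmentation."""
    if not segments:
        return []
    merged = [list(segments[0])]
    for seg in segments[1:]:
        a = {w for _, w in merged[-1]}
        b = {w for _, w in seg}
        if len(a & b) / len(a | b) >= min_overlap:
            merged[-1].extend(seg)
        else:
            merged.append(list(seg))
    return merged


def discover_segments(transcript, max_gap=10.0, factor=2.0, min_overlap=0.2):
    """Run the three passes in sequence on (time, word) pairs."""
    features = [(t, w.lower()) for t, w in transcript
                if w.lower() not in STOP_WORDS]
    first = boundary_pass(features, max_gap)
    second = rate_pass(first, factor)
    return content_pass(second, min_overlap)


transcript = [
    (0, "budget"), (1, "tax"), (2, "budget"), (3, "tax"),
    (9, "budget"), (10, "tax"),
    (30, "weather"), (31, "rain"), (32, "weather"),
]
segments = discover_segments(transcript)
print([[w for _, w in seg] for seg in segments])
```

In this toy input, pass 1 separates the budget and weather material at the 20-second gap, pass 2 additionally splits the budget material at the 6-second pause, and pass 3 merges the two budget pieces back together because they use the same words.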

22 Claims

1. A method automatically discovering salient segments in a speech transcript, comprising:
- performing a first segmentation of the speech transcript using a boundary-based process to generate a sequence of first segments, indicative of a temporal proximity of features in the speech;
  
  performing a second segmentation of the first segments for determining a rate of arrival of the features, and for generating a sequence of second segments; and
  
  performing a third segmentation of the second segments using a content-based process to generate a sequence of third segments, to minimize oversegmentation.
- Dependent claims (2, 3, 4, 5, 6, 7, 8)
  - 2. The method of claim 1, wherein performing the third segmentation comprises performing segmentation of adjacent segments.
  - 3. The method of claim 2, wherein the speech transcript contains audio data.
  - 4. The method of claim 2, wherein the speech transcript contains video data.
  - 5. The method of claim 4, wherein performing the first segmentation comprises using n-grams for the video data and a mixture of n-grams and content words and noun phrases as a feature set to reduce noise features.
  - 6. The method of claim 1, wherein the features comprise technical terms.
  - 7. The method of claim 1, further comprising modifying the features based on a genre of input data in the speech transcript.
  - 8. The method of claim 2, wherein performing the third segmentation comprises merging at least some of the second segments.

9. A computer program for automatically discovering salient segments in a speech transcript, comprising:
- a first set of program instructions for performing a first segmentation of the speech transcript using a boundary-based process to generate a sequence of first segments, indicative of a temporal proximity of features in the speech;
  
  a second set of program instructions for performing a second segmentation of the first segments for determining a rate of arrival of the features, and for generating a sequence of second segments; and
  
  a third set of program instructions for performing a third segmentation of the second segments using a content-based process to generate a sequence of third segments, to minimize oversegmentation.
- Dependent claims (10, 11, 12, 13)
  - 10. The computer program of claim 9, wherein the third set of program instructions merges at least some adjacent second segments.
  - 11. The computer program of claim 10, wherein the speech transcript contains audio data.
  - 12. The computer program of claim 10, wherein the speech transcript contains video data.
  - 13. The computer program of claim 9, wherein the features comprise technical terms.

14. A system for automatically discovering salient segments in a speech transcript, comprising:
- means for performing a first segmentation of the speech transcript using a boundary-based process to generate a sequence of first segments, indicative of a temporal proximity of features in the speech;
  
  means for performing a second segmentation of the first segments for determining a rate of arrival of the features, and for generating a sequence of second segments; and
  
  means for performing a third segmentation of the second segments using a content-based process to generate a sequence of third segments, to minimize oversegmentation.
- Dependent claims (15, 16, 17, 18, 19, 20)
  - 15. The system of claim 14, wherein the means for performing the third segmentation merges at least some adjacent second segments.
  - 16. The system of claim 15, wherein the speech transcript contains audio data.
  - 17. The system of claim 15, wherein the speech transcript contains video data.
  - 18. The system of claim 14, wherein the features comprise technical terms.
  - 19. The system of claim 14, wherein the features comprise word n-grams.
  - 20. The system of claim 19, wherein the first and second segmentations compute a maximum number of segments to be discovered.
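
Claim 19 names word n-grams as features. As a rough illustration only, the sketch below extracts word n-grams from a tokenized transcript and keeps those that recur. The function names, the whitespace tokenization, and the `min_count` filter are assumptions made for this example and are not taken from the patent.

```python
from collections import Counter


def word_ngrams(words, n):
    """Return every run of n consecutive words as a tuple."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def repeated_ngrams(words, n=2, min_count=2):
    """Keep only n-grams seen at least min_count times (assumed filter)."""
    counts = Counter(word_ngrams(words, n))
    return {gram: c for gram, c in counts.items() if c >= min_count}


words = "speech recognition errors hurt speech recognition accuracy".split()
print(repeated_ngrams(words))
```

Here the only bigram that occurs twice is `("speech", "recognition")`, so it is the only feature kept.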

21. A method automatically discovering salient segments in a time varying signal, comprising:
- performing a first segmentation of the time varying signal using a boundary-based process to generate a sequence of first segments, indicative of a temporal proximity of features in the speech;
  
  performing a second segmentation of the first segments for determining a rate of arrival of the features, and for generating a sequence of second segments; and
  
  performing a third segmentation of the second segments using a content-based process to generate a sequence of third segments, to minimize oversegmentation.
- Dependent claims (22)
  - 22. The method of claim 21, wherein the time varying signal includes a visual component.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: International Business Machines Corporation
Inventors: Ponceleon, Dulce Beatriz; Srinivasan, Savitha
Primary Examiner(s): Lerner, Martin
Application Number: US10/109,960
Publication Number: US 20030187642A1
Time in Patent Office: 1,229 Days
Field of Search: 704/9, 704/252, 704/253, 704/254, 704/255, 704/256, 704/257, 704/270, 707/2, 707/7
US Class Current: 704/253
CPC Class Codes:
G10L 15/1822 Parsing for meaning underst...
Y10S 707/99937 Sorting