System and method for hierarchical segmentation with latent semantic indexing in scale space
First Claim
1. A computer-implemented method for generating a table of contents for a document using information in said document, comprising:
building a model of said document including an initial semantic structure;
detecting hierarchical changes in said semantic structure spanning different scales by applying successively smaller scale filter windows to said model according to said initial semantic structure to construct a map of said changes versus scale;
identifying local peaks in said contour map, said peaks being points of maximum vector derivative magnitude;
tracing said local peaks back to a semantic structure change origin point;
measuring a span of scales over which each said change exists; and
ordering said changes into entries in said table of contents based on scale span.
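The steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the document model is a list of per-unit semantic vectors, that the filter windows are Gaussian, that the derivative is a forward difference, and that a peak is traced between scales by nearest position. The function names, the `max_drift` tolerance, and the sample vectors are all hypothetical.

```python
import math

def gaussian_kernel(sigma):
    """Normalized Gaussian weights truncated at about 3 sigma (assumed window shape)."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(vectors, sigma):
    """Filter each vector component along the document, clamping at both ends."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n, dims = len(vectors), len(vectors[0])
    out = []
    for i in range(n):
        acc = [0.0] * dims
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            for d in range(dims):
                acc[d] += w * vectors[j][d]
        out.append(acc)
    return out

def derivative_magnitude(vectors):
    """Magnitude of the forward difference; entry i is the boundary after unit i."""
    return [math.dist(a, b) for a, b in zip(vectors, vectors[1:])]

def local_peaks(mags, floor=1e-9):
    """Indices whose magnitude exceeds the left neighbor and is not below the right."""
    return [i for i in range(1, len(mags) - 1)
            if mags[i] > floor and mags[i] > mags[i - 1] and mags[i] >= mags[i + 1]]

def trace_changes(vectors, scales, max_drift=2):
    """Return (origin_index, scale_span) pairs, widest span first."""
    tracks = []  # each track is [current_position, span_in_scales]
    for sigma in sorted(scales, reverse=True):  # successively smaller windows
        peaks = local_peaks(derivative_magnitude(smooth(vectors, sigma)))
        next_tracks = []
        for p in peaks:
            near = [t for t in tracks if abs(t[0] - p) <= max_drift]
            if near:
                # Continue the closest coarser-scale peak down to this scale.
                parent = min(near, key=lambda t: abs(t[0] - p))
                tracks.remove(parent)
                next_tracks.append([p, parent[1] + 1])
            else:
                # A change that first appears at this scale.
                next_tracks.append([p, 1])
        tracks = next_tracks
    return sorted(((pos, span) for pos, span in tracks), key=lambda e: (-e[1], e[0]))

# Hypothetical 32-unit document: one large topic change and two smaller ones.
levels = [0.0] * 8 + [0.2] * 8 + [0.8] * 8 + [1.0] * 8
vectors = [[f, 1.0 - f] for f in levels]
for origin, span in trace_changes(vectors, [8, 4, 2, 1]):
    print("  " * (4 - span) + f"change after unit {origin} (span {span})")
```

In this sample the large change after unit 15 persists across all four scales, while the two smaller changes appear only at the two finest scales, so the large change is listed first and the smaller ones are indented beneath it.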
Abstract
A system and method for automatically generating a hierarchical table of contents or outline for indexing a document and identifying clusters of related information in the document. The document may comprise text, audio, video, or a multimedia presentation. The invention combines latent semantic indexing techniques, which identify related blocks and major topic changes within the document, with scale space segmentation techniques, which identify self-similar blocks within the document and thereby find topic changes of various sizes at block edges. The invention then produces a visual presentation of the semantic structure of the document.
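As a rough illustration of the latent semantic indexing step, the sketch below builds a term-by-unit count matrix, reduces it with a truncated singular value decomposition, and compares adjacent units by cosine similarity. It assumes whitespace tokenization, raw counts without weighting, and NumPy; the function names and the sample units are hypothetical and are not taken from the specification.

```python
import numpy as np

def lsi_unit_vectors(units, k=2):
    """Project each text unit into a k-dimensional latent space via truncated SVD."""
    vocab = sorted({w for u in units for w in u.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(units)))
    for j, u in enumerate(units):
        for w in u.lower().split():
            counts[index[w], j] += 1.0
    # counts = U @ diag(s) @ Vt; unit coordinates are the scaled columns of Vt.
    _, s, vt = np.linalg.svd(counts, full_matrices=False)
    return (vt[:k] * s[:k, None]).T

def adjacent_similarity(vecs):
    """Cosine similarity between each unit and the next (assumes nonzero vectors)."""
    norms = np.linalg.norm(vecs, axis=1)
    dots = np.sum(vecs[:-1] * vecs[1:], axis=1)
    return dots / (norms[:-1] * norms[1:])

# Hypothetical units: two about pets, then two about markets.
units = ["cat dog pet", "cat dog", "stock market price", "stock market"]
print(adjacent_similarity(lsi_unit_vectors(units, k=2)))
```

A low similarity between neighboring units marks a candidate topic change, which is the kind of signal the scale space stage would then examine at several window sizes.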
4 Claims
1. A computer-implemented method for generating a table of contents for a document using information in said document, comprising:
building a model of said document including an initial semantic structure;
detecting hierarchical changes in said semantic structure spanning different scales by applying successively smaller scale filter windows to said model according to said initial semantic structure to construct a map of said changes versus scale;
identifying local peaks in said contour map, said peaks being points of maximum vector derivative magnitude;
tracing said local peaks back to a semantic structure change origin point;
measuring a span of scales over which each said change exists; and
ordering said changes into entries in said table of contents based on scale span.
3. A system for generating a table of contents for a document using information in said document, comprising:
means for building a model of said document including an initial semantic structure;
means for detecting hierarchical changes in said semantic structure spanning different scales by applying successively smaller scale filter windows to said model according to said initial semantic structure to construct a map of said changes versus scale;
identifying local peaks in said contour map, said peaks being points of maximum vector derivative magnitude;
tracing said local peaks back to a semantic structure change origin point;
measuring a span of scales over which each said change exists; and
means for ordering said changes into entries in said table of contents based on scale span.
4. A computer program product comprising a machine-readable medium tangibly embodying computer-executable program instructions thereon for generating a table of contents for a document using information in said document, including:
a first code means for building a model of said document including an initial semantic structure;
a second code means for detecting hierarchical changes in said semantic structure spanning different scales by applying successively smaller scale filter windows to said model according to said initial semantic structure to construct a map of said changes versus scale;
identifying local peaks in said contour map, said peaks being points of maximum vector derivative magnitude;
tracing said local peaks back to a semantic structure change origin point;
measuring a span of scales over which each said change exists; and
a third code means for ordering said changes into entries in said table of contents based on scale span.
Specification