METHODS FOR EFFICIENT CLUSTER ANALYSIS
First Claim
1. A computer readable medium storing a computer program which when executed by at least one processor defines structure for a document comprising a plurality of primitive elements that are defined in terms of their position in the document, the computer program comprising sets of instructions for:
identifying, for a particular set of primitive elements, a pairwise grouping of nearest primitive elements in the set;
sorting the pairwise groupings of primitive elements based on an order from the closest to the furthest pairs;
storing a single value that identifies which of the pairwise groupings of primitive elements are sufficiently far apart to form a partition; and
using the stored single value to identify and analyze the partition in order to define structural elements for the document.
Abstract
Some embodiments provide a method for defining structure for an unstructured document that includes a number of primitive elements that are defined in terms of their position in the document. The method identifies a pairwise grouping of nearest primitive elements. The method sorts the pairwise primitive elements based on an order from the closest to the furthest pairs. The method stores a single value that identifies which of the pairwise primitive elements are sufficiently far apart to form a partition. The method uses the stored value to identify and analyze the partitions in order to define structural elements for the document.
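The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the primitive elements reduce to one-dimensional positions (for example, glyph offsets along a line) and that "sufficiently far apart" means a gap larger than a caller-supplied threshold. The names partition_positions, min_gap, and cut are hypothetical and do not come from the patent.

```python
from bisect import bisect_right


def partition_positions(positions, min_gap):
    """Split 1-D positions into clusters wherever neighbours are more than
    min_gap apart. Returns (cut, clusters). Illustrative only: min_gap is an
    assumed criterion for "sufficiently far apart"."""
    order = sorted(positions)
    if not order:
        return 0, []
    # Pairwise grouping: in one dimension, each element is paired with the
    # next element in sorted order. Each pair is stored as (gap, left index).
    pairs = [(order[i + 1] - order[i], i) for i in range(len(order) - 1)]
    # Sort the pairs from closest to furthest.
    pairs.sort()
    # The single stored value: the index of the first pair whose gap exceeds
    # min_gap. Every pair from this index onward marks a partition boundary.
    cut = bisect_right([gap for gap, _ in pairs], min_gap)
    # Recover the boundaries and slice the ordered elements into clusters.
    boundaries = sorted(i for _, i in pairs[cut:])
    clusters, start = [], 0
    for i in boundaries:
        clusters.append(order[start:i + 1])
        start = i + 1
    clusters.append(order[start:])
    return cut, clusters


cut, clusters = partition_positions([0, 1, 2, 10, 11, 30], min_gap=5)
print(cut, clusters)  # 3 [[0, 1, 2], [10, 11], [30]]
```

Because the pairs are sorted by distance, one index is enough to separate the pairs that stay together from those that split, which is why a single stored value suffices in this sketch.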
23 Claims
1. A computer readable medium storing a computer program which when executed by at least one processor defines structure for a document comprising a plurality of primitive elements that are defined in terms of their position in the document, the computer program comprising sets of instructions for:
identifying, for a particular set of primitive elements, a pairwise grouping of nearest primitive elements in the set;
sorting the pairwise groupings of primitive elements based on an order from the closest to the furthest pairs;
storing a single value that identifies which of the pairwise groupings of primitive elements are sufficiently far apart to form a partition; and
using the stored single value to identify and analyze the partition in order to define structural elements for the document.
Dependent claims: 2-20
21. A method for defining a program for defining structure for a document, the method comprising:
defining a module for identifying a pairwise grouping of nearest primitive elements in a document comprising a plurality of primitive elements that are defined in terms of their position in the document;
defining a module for sorting the pairwise groupings of primitive elements based on an order from the closest to the furthest pairs;
defining a module for storing a single value that identifies which of the pairwise-grouped primitive elements are sufficiently far apart to form a partition; and
defining a module for using the stored value to identify and analyze the partitions in order to define structural elements for the document.
Dependent claims: 22, 23
Specification