Textual on-line analytical processing method and system
Abstract
The present invention provides a system and method for OLAP analysis of unstructured content. This is accomplished by transforming isolated, unstructured content into quantifiable structured data, thereby creating a common measure for performing OLAP analysis. This allows unstructured content to be integrated seamlessly with structured data sources. It also makes it possible to query information that enterprises already possessed but previously could not query.
78 Claims
1. A method for processing unstructured documents to populate an OLAP data structure, the method comprising:
selecting a plurality of unstructured documents from a corpus of unstructured documents;
computing a document representation for each selected document;
organizing said selected documents into a hierarchy of document clusters based on said document representations;
populating the OLAP data structure using said hierarchy of document clusters, and;
computing a document measure for each selected document.
Dependent claims: 2-26.
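The representation and clustering steps of claim 1 can be illustrated with a minimal sketch. The claim does not say which representation or clustering technique is used, so everything here is an assumption chosen for brevity: a bag-of-words term-frequency vector as the document representation, cosine similarity between summed vectors, and a simple agglomerative merge loop. The names `represent`, `cosine`, `build_hierarchy` and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def represent(text):
    """Assumed representation: term-frequency vector over lowercase words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_hierarchy(vectors):
    """Agglomerative clustering: repeatedly merge the two most similar clusters.

    Returns the hierarchy as nested tuples of document ids. Each cluster is
    summarised by the sum of its members' vectors, which is enough for cosine
    similarity because cosine ignores vector length.
    """
    clusters = [(doc_id, vec) for doc_id, vec in vectors.items()]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = cosine(clusters[i][1], clusters[j][1])
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical selected documents from a corpus.
docs = {
    "d1": "olap cube roll up cube",
    "d2": "olap cube drill down",
    "d3": "football match score",
    "d4": "football score goal",
}
vectors = {doc_id: represent(text) for doc_id, text in docs.items()}
hierarchy = build_hierarchy(vectors)
print(hierarchy)
```

The quadratic pair search is acceptable for a handful of documents; a real corpus would need a more scalable clustering method, which the claim leaves open.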
27. A computer readable medium containing computer executable instructions for processing unstructured documents to populate an OLAP data structure, the computer readable medium comprising:
a selection module for;
selecting a plurality of unstructured documents from a corpus of unstructured documents;
a representation module for;
computing a document representation for each selected document; and
an organization module for;
organizing said selected documents into a hierarchy of document clusters based on said document representations;
populating the OLAP data structure using said hierarchy of document clusters, and;
computing a document measure for each selected document.
Dependent claims: 28-52.
53. A computing apparatus for processing unstructured documents to populate an OLAP data structure, the computing apparatus operative to:
select a plurality of unstructured documents from a corpus of unstructured documents;
compute a document representation for each selected document;
organize said selected documents into a hierarchy of document clusters based on said document representations;
populate the OLAP data structure using said hierarchy of document clusters, and;
compute a document measure for each selected document.
Dependent claims: 54-78.
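The populate and measure steps recited in the independent claims can be sketched in the same spirit. The claims do not define the layout of the OLAP data structure or the document measure, so this sketch assumes a fact table keyed by cluster path (acting as a dimension hierarchy) and uses word count as a stand-in measure. The cluster tree, the labels `root`, `c0`, `c1`, and the functions `cluster_paths`, `populate` and `roll_up` are all hypothetical.

```python
from collections import defaultdict


def cluster_paths(tree, path=("root",)):
    """Yield (doc_id, path), where path names the clusters from root to the
    document's immediate parent cluster. Leaves are document id strings."""
    if isinstance(tree, str):  # degenerate hierarchy with a single document
        yield tree, path
        return
    for index, child in enumerate(tree):
        if isinstance(child, str):
            yield child, path
        else:
            yield from cluster_paths(child, path + (f"c{index}",))


def populate(tree, measures):
    """Build fact rows: one row per document with its cluster path and measure."""
    return [
        {"doc": doc_id, "path": path, "measure": measures[doc_id]}
        for doc_id, path in cluster_paths(tree)
    ]


def roll_up(facts, depth):
    """Aggregate the measure at a given depth of the cluster hierarchy."""
    totals = defaultdict(int)
    for fact in facts:
        totals[fact["path"][:depth]] += fact["measure"]
    return dict(totals)


# Hypothetical inputs: a two-level cluster hierarchy and the source documents.
tree = (("d3", "d4"), ("d1", "d2"))
docs = {
    "d1": "olap cube roll up cube",
    "d2": "olap cube drill down",
    "d3": "football match score",
    "d4": "football score goal",
}
# Assumed document measure: number of whitespace-separated words.
measures = {doc_id: len(text.split()) for doc_id, text in docs.items()}

facts = populate(tree, measures)
by_cluster = roll_up(facts, depth=2)   # totals per second-level cluster
grand_total = roll_up(facts, depth=1)  # total for the whole corpus
print(by_cluster)
print(grand_total)
```

Keying each fact by its full cluster path lets the same rows serve both drill-down (longer prefixes) and roll-up (shorter prefixes), which is the behaviour one would expect of a dimension hierarchy in an OLAP cube.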
Specification