Method and system for business intelligence analytics on unstructured data

  • US 8,266,148 B2
  • Filed: 10/07/2009
  • Issued: 09/11/2012
  • Est. Priority Date: 10/07/2008
  • Status: Active Grant
First Claim

1. A machine-implemented method for a pipelined process of capture, classification and dimensioning of data from a plurality of data sources that include unstructured data having no explicit dimensions associated with the unstructured data to generate a domain-relevant classified data index that is useable by a plurality of different intelligence metrics to perform different kinds of business intelligence analytics, the method comprising:

  • using a data processing machine to collect ingested data as one or more documents from each of the plurality of data sources that include unstructured data and automatically generate and store an ingested data index representing the ingested data that includes at least a hyperlink and extracted meta data for each document;

  • using a data processing machine to automatically classify each of the one or more documents into one or more relevance classifications that are stored with the ingested data index for that document to form a domain-relevant classified data index representing the ingested data, wherein the relevance classifications are based on a plurality of dynamically generated topics that are generated in response to machine analysis that includes machine-defined classifiers and in response to machine-prompted user input that distinguishes between user-defined named-entities and user-defined keywords and includes hierarchy information for establishing a hierarchical relationship among the one or more relevance classifications; and

  • using a data processing machine to automatically process the plurality of data sources with a plurality of different intelligence metric modules independent of and after the one or more documents have been initially ingested and classified by utilizing the domain-relevant classified data index to generate analytics results that are presented for a user, including processing at least one of the documents in the ingested data with each intelligence metric module based upon a plurality of dimensions abstracted from the relevance classifications and the extracted metadata that includes at least one implicit dimension derived from one or more of the user-defined named-entities, wherein the intelligence metric modules do not modify the ingested data index, and the dynamically generated topics upon which the relevance classifications are based are not determined prior to using the data processing machine to collect ingested data based upon analytic requirements of the intelligence metric modules such that the relevance classifications are separated in the pipelined process from analytic requirements of one or more of the any given intelligence metric modules.
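
The three steps recited above (ingest, classify, then run independent metric modules over the classified index) can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names (`ingest`, `classify`, `topic_volume`, `entity_mentions`, `run_metrics`), the dictionary layout of index entries and topics, and the substring-matching classifier are all assumptions made for illustration. The sketch only shows the separation the claim describes: topics are supplied after ingestion, and metric modules read the classified index without modifying it.

```python
from collections import Counter

def ingest(sources):
    # Stage 1 (hypothetical layout): one index entry per document,
    # holding a hyperlink and some extracted metadata.
    return [
        {"hyperlink": doc["url"],
         "metadata": {"words": len(doc["text"].split())},
         "text": doc["text"]}
        for source in sources for doc in source
    ]

def classify(index, topics):
    # Stage 2: attach relevance classifications. Topics arrive after ingestion
    # and distinguish named entities from keywords; "parent" carries hierarchy.
    # Plain substring matching stands in for a real classifier (assumption).
    classified = []
    for entry in index:
        text = entry["text"].lower()
        labels, entities = [], []
        for topic in topics:
            hits = [e for e in topic["named_entities"] if e.lower() in text]
            if hits or any(k.lower() in text for k in topic["keywords"]):
                labels.append(topic["name"])
                if topic.get("parent"):
                    labels.append(topic["parent"])
            entities.extend(hits)
        classified.append({**entry,
                           "classifications": tuple(dict.fromkeys(labels)),
                           "entities": tuple(entities)})
    return tuple(classified)  # tuple: metric modules cannot append or remove entries

def topic_volume(index):
    # Metric module: document count per relevance classification.
    return Counter(label for e in index for label in e["classifications"])

def entity_mentions(index):
    # Metric module: an implicit dimension derived from user-defined named entities.
    return Counter(name for e in index for name in e["entities"])

def run_metrics(index, metrics):
    # Stage 3: each metric module only reads the classified index.
    return {m.__name__: m(index) for m in metrics}

sources = [
    [{"url": "http://example.com/a", "text": "Acme recalls faulty brakes"}],
    [{"url": "http://example.com/b", "text": "Customers praise Acme support"},
     {"url": "http://example.com/c", "text": "Weather update"}],
]
topics = [
    {"name": "recalls", "parent": "quality", "named_entities": [], "keywords": ["recall"]},
    {"name": "acme", "parent": None, "named_entities": ["Acme"], "keywords": []},
]

index = classify(ingest(sources), topics)
results = run_metrics(index, [topic_volume, entity_mentions])
print(results)
```

Because the metric functions take the classified index as their only input, a new metric can be added later without re-ingesting or re-classifying documents, which mirrors the separation between classification and analytic requirements that the claim emphasizes.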
