
Efficient determination of sample size to facilitate building a statistical model

  • US 7,409,371 B1
  • Filed: 06/04/2001
  • Issued: 08/05/2008
  • Est. Priority Date: 06/04/2001
  • Status: Expired due to Fees
First Claim

1. A computer implemented system that facilitates building a statistical model for a computer readable data set, comprising:

    a first training method that efficiently builds a rough statistical model from a subset of the computer readable data set capable of statistical characterization;

    an evaluation component that evaluates the rough statistical model to determine whether the subset of the computer readable data set is an appropriate subset to be utilized to build a refined statistical model for the computer readable data set based at least in part on stopping criterion to facilitate reducing cost of clustering data relative to the computer readable data set;

    a second training method that builds the refined statistical model for the computer readable data set from the subset if the subset is deemed appropriate by the evaluation component, the refined statistical model provides a more accurate modeling of the subset than the rough statistical model and facilitates determining good clustering of data for a fixed number of clusters based at least in part on predefined accuracy criteria to facilitate clustering of data relative to the computer readable data set, wherein the clustered data is provided; and

    a data scheduler that, based at least in part on a data policy, adaptively controls the size of subsets for which the first training method is applied to facilitate building the refined statistical model.
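The interplay of the four claimed elements can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the 1-D k-means model, the doubling data policy, and the held-out distortion stopping criterion are all assumptions chosen for illustration. A cheap "rough" fit (two Lloyd iterations) is run on growing subsets; once the rough model's held-out score stops improving by more than a tolerance, the subset is treated as appropriate and a "refined" fit (many iterations) is run on that same subset.

```python
import random


def kmeans_1d(points, k, iters):
    """Lloyd's algorithm on 1-D points; `iters` caps the training effort."""
    s = sorted(points)
    n = len(s)
    # Deterministic quantile initialisation (an illustrative choice).
    centers = [s[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in points:
            j = min(range(k), key=lambda c: (x - centers[c]) ** 2)
            groups[j].append(x)
        new = [sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)]
        if new == centers:
            break  # converged
        centers = new
    return centers


def distortion(centers, points):
    """Mean squared distance from each point to its nearest center."""
    return sum(min((x - c) ** 2 for c in centers) for x in points) / len(points)


def choose_sample_size(data, k, start=50, growth=2, tol=0.05, holdout=200):
    """Grow the subset until the rough model stops improving, then refine.

    Assumes `data` is already shuffled, so any prefix is a random subset.
    """
    held, pool = data[:holdout], data[holdout:]
    size, prev = start, None
    while True:
        size = min(size, len(pool))
        subset = pool[:size]
        rough = kmeans_1d(subset, k, iters=2)      # first training method (cheap)
        score = distortion(rough, held)            # evaluation component
        if prev is not None and abs(prev - score) <= tol * prev:
            break                                  # stopping criterion met
        if size == len(pool):
            break                                  # no more data to add
        prev = score
        size *= growth                             # data scheduler: doubling policy
    refined = kmeans_1d(subset, k, iters=100)      # second training method
    return size, refined


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(m, 1.0) for m in (0.0, 10.0, 20.0) for _ in range(1000)]
    rng.shuffle(data)
    size, centers = choose_sample_size(data, k=3)
    print(size, sorted(centers))
```

The saving in this sketch comes from running the expensive refined fit only once, on a subset whose size was selected using cheap rough fits, rather than on the full data set.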
