Statistical models for improving the performance of database operations

US 7,149,649 B2
Filed: 05/15/2002
Issued: 12/12/2006
Est. Priority Date: 06/08/2001
Status: Expired due to Fees

Abstract

A method for performing an automatic software-driven statistical evaluation of a large amount of data to be assigned to statistical variables in a database contained in at least one cluster. The method is characterized by using a statistical model to approximately describe a relative frequency of the state or states of the statistical variables and a statistical dependency between the state or states, and then determining the approximate relative frequency of the state or states of the statistical variables and the approximate relative frequency belonging to a predetermined relative frequency of the state or states of the statistical variables and an expected value of the state or states of the statistical variables dependent thereon.
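The idea summarized in the abstract can be illustrated with a minimal Python sketch. It assumes a hypothetical three-variable table and a simple chain-structured model (a -> b -> c) estimated from counts. This is only one possible kind of statistical model and is not taken from the patent; the data, variable names, and function names are all illustrative. The sketch answers a frequency query and an expected-value query from the fitted model and, for comparison, from a full scan of the rows.

```python
from collections import Counter

# Hypothetical rows: (region, segment, purchases). Purely illustrative data.
ROWS = [
    ("north", "retail", 1),
    ("north", "retail", 2),
    ("north", "business", 4),
    ("south", "retail", 1),
    ("south", "business", 4),
    ("south", "business", 5),
]


def fit_chain(rows):
    """Estimate P(a), P(b | a) and P(c | b) from counts (assumed chain a -> b -> c)."""
    n = len(rows)
    a_counts = Counter(a for a, _, _ in rows)
    b_counts = Counter(b for _, b, _ in rows)
    ab_counts = Counter((a, b) for a, b, _ in rows)
    bc_counts = Counter((b, c) for _, b, c in rows)
    p_a = {a: k / n for a, k in a_counts.items()}
    p_b_given_a = {(a, b): k / a_counts[a] for (a, b), k in ab_counts.items()}
    p_c_given_b = {(b, c): k / b_counts[b] for (b, c), k in bc_counts.items()}
    return p_a, p_b_given_a, p_c_given_b


def model_distribution(model, a):
    """Approximate P(c | a) as sum_b P(b | a) * P(c | b), without reading the rows."""
    _, p_b_given_a, p_c_given_b = model
    dist = Counter()
    for (a2, b), p_ba in p_b_given_a.items():
        if a2 != a:
            continue
        for (b2, c), p_cb in p_c_given_b.items():
            if b2 == b:
                dist[c] += p_ba * p_cb
    return dict(dist)


def exact_distribution(rows, a):
    """Exact relative frequencies of c among rows with the given a (full scan)."""
    selected = [c for a2, _, c in rows if a2 == a]
    return {c: k / len(selected) for c, k in Counter(selected).items()}


def expected_value(dist):
    """Expected value of a numeric variable under a distribution {value: probability}."""
    return sum(c * p for c, p in dist.items())


model = fit_chain(ROWS)
approx = model_distribution(model, "north")
exact = exact_distribution(ROWS, "north")
print("model-based:", approx, "expected:", expected_value(approx))
print("exact scan: ", exact, "expected:", expected_value(exact))
```

The model-based answer differs from the exact one wherever the chain assumption (c independent of a given b) does not hold in the data, which is the sense in which the relative frequencies are approximate.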

16 Claims

1. A method for an automatic software-driven statistical evaluation of a large amount of data to be assigned to statistical variables in a database contained in at least one cluster, the method comprising:
- developing a statistical model that approximately describes at least one relative frequency of the states of the statistical variables and a statistical dependency between the states of the statistical variables;
  
  determining an approximate relative frequency of the states of the statistical variables and an approximate relative frequency belonging to an at least one pre-determined relative frequency of the states of the statistical variables and an expected value of the states of the statistical variables dependent thereon by using data stored in the database and the statistical model;
  
  and performing a statistical evaluation of at least one of;
  
  (a) customer data in a Web reporting/Web mining area;
  
  (b) customer data in a customer relationship management system;
  
  (c) an environmental database;
  
  (d) a medical database; and
  
  (e) a genome database;
  
  and outputting results of the statistical evaluation.
- Dependent claims:
  - 2. The method according to claim 1, wherein the statistical model is a graphical probabilistic model.
  - 3. The method according to claim 1, wherein the statistical model is a Bayesian network.
  - 4. The method according to claim 1, wherein a statistical clustering model algorithm is used to subdivide the data into a plurality of clusters.
  - 5. The method according to claim 1, wherein a distance-based clustering algorithm is used to subdivide the data into a plurality of clusters.
  - 6. The method according to claim 4, wherein the data considered is restricted to the data contained in at least one cluster.
  - 7. The method according to claim 5, wherein the data considered is restricted to the data contained in at least one cluster.
  - 8. The method according to claim 6, wherein the data belonging to the at least one cluster is restricted to specific states of statistical variables having at least one specific relative frequency.
  - 9. The method according to claim 7, wherein the data belonging to the at least one cluster is restricted to specific states of statistical variables having at least one specific relative frequency.
  - 10. The method according to claim 5, wherein the data belonging to the at least one cluster is stored on a data carrier respective to a cluster affiliation.
  - 11. The method according to claim 9, wherein the data belonging to the at least one cluster is stored on a data carrier respective to a cluster affiliation.
  - 12. The method according to claim 1, wherein a database reporting method or an OLAP method is used to determine the relative frequencies and the expected value of the states of statistical variables.
  - 13. The method according to claim 11, wherein a database reporting method or an OLAP method is used to determine the relative frequencies and the expected value of the states of statistical variables.
  - 14. The method according to claim 12, wherein the database reporting method or the OLAP method is used when a test variable equals or exceeds a predetermined value.
  - 15. The method according to claim 13, wherein the database reporting method or the OLAP method is used when a test variable equals or exceeds a predetermined value.

16. A method for an automatic software-driven statistical evaluation of a large amount of data to be assigned to statistical variables in a database contained in one or several clusters comprising:
- subdividing the data into many clusters by a distance-based clustering algorithm, wherein the data considered is restricted to the data contained in at least one cluster;
  
  determining at least one relative frequency and at least one expected value of states of statistical variables by using a database reporting method or an OLAP method;
  
  and performing a statistical evaluation of at least one of;
  
  (a) customer data in a Web reporting/Web mining area;
  
  (b) customer data in a customer relationship management system;
  
  (c) an environmental database;
  
  (d) a medical database; and
  
  (e) a genome database;
  
  and outputting results of the statistical evaluation.
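
The steps recited in claim 16 can be sketched in Python under stated assumptions. The sketch uses plain k-means with Euclidean distance as one example of a distance-based clustering algorithm, and a simple in-memory aggregation stands in for a database reporting or OLAP query. The records, field meanings, initial centroids, and function names are hypothetical and are not taken from the patent.

```python
import math
from collections import Counter

# Hypothetical records: (visits_per_week, basket_value, channel). Illustrative only.
RECORDS = [
    (1.0, 10.0, "web"),
    (1.5, 12.0, "web"),
    (2.0, 11.0, "store"),
    (8.0, 50.0, "web"),
    (9.0, 55.0, "store"),
    (8.5, 52.0, "store"),
]


def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means with Euclidean distance; returns one cluster index per point."""
    centroids = list(initial_centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster has no members
                centroids[k] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def report(records, labels, cluster):
    """Restrict to one cluster, then compute exact frequencies and an expected value."""
    selected = [r for r, lab in zip(records, labels) if lab == cluster]
    counts = Counter(r[2] for r in selected)
    channel_freq = {c: k / len(selected) for c, k in counts.items()}
    mean_basket = sum(r[1] for r in selected) / len(selected)
    return channel_freq, mean_basket


points = [(r[0], r[1]) for r in RECORDS]
labels = kmeans(points, [(1.0, 10.0), (9.0, 55.0)])
channel_freq, mean_basket = report(RECORDS, labels, cluster=1)
print("cluster labels:", labels)
print("channel frequencies in cluster 1:", channel_freq)
print("mean basket value in cluster 1:", mean_basket)
```

Restricting the aggregation to a single cluster means only that cluster's rows are scanned, which is how the clustering step reduces the amount of data considered in this sketch.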

Current Assignee: Panoratio Database Images GmbH (Panoratio Holdings, Inc.)
Original Assignee: Panoratio Database Images GmbH (Panoratio Holdings, Inc.)
Inventors: Hofmann, Reimar; Haft, Michael
Primary Examiner(s): Nghiem, Michael
Assistant Examiner(s): Washburn, Douglas N
Application Number: US10/479,991
Publication Number: US 20040186684A1
Time in Patent Office: 1,672 Days
Field of Search: 702/179, 702/181, 702/199
US Class Current: 702/179
CPC Class Codes:
G06F 16/2465 Query processing support fo...
G06F 16/30 of unstructured textual dat...
