AUTOMATED HYPOTHESIS TESTING

US 20110004442A1
Filed: 09/09/2010
Published: 01/06/2011
Est. Priority Date: 04/11/2006
Status: Active Grant

Abstract

A method of automatically applying a hypothesis test to a data set. The method reduces errors made in failing to appreciate predicate assumptions of various statistical tests, and elicits a series of indications from the user regarding characteristics of interest embodied by the data set to select an appropriate statistical test. The system also reduces errors in constructing competing null and alternative hypothesis statements by generating a characterization of the data and defining null and alternative hypotheses according to the indications, selected statistical test, and conventions adopted with respect to the tests. The system also establishes a significance level, calculates the test statistic, and generates an output. The output of the system provides a plain interpretation of the quantitative results in the terms indicated by the user to reduce errors in interpretation of the conclusion.

19 Claims

1. A method of automatically applying hypothesis testing to at least one data set, the method comprising using a computer to carry out the steps of:
- providing a plurality of statistical tests applicable to the at least one data set, the tests having a variety of characteristics and associated conventions;
  
  seeking an indication as to at least one characteristic of the data set;
  
  selecting a test from among the plurality of tests based on the indications and associated conventions;
  
  providing a notification of the nature of the selected test, the indications and established conventions;
  
  characterizing the data set, including establishing test criteria, selecting an appropriate reference test value depending on the established test criteria; and
  
  eliciting an indication of a description of the data of interest;
  
  constructing a null hypothesis statement and an alternative hypothesis statement;
  
  receiving an indication of a significance level,

  conducting the selected test including
    calculating the values of the test statistic, reference values, confidence bounds and p-values,
    comparing the calculated p-value to the indicated significance level,
    comparing the value of the test statistic to one or more reference values, and
    assessing the confidence bounds in view of the null hypothesis statement;
  
  stating a conclusion based on the selected test and the indicated test criteria, the conclusion including whether to reject the null hypothesis or not reject the null hypothesis, and the basis for the conclusion.
  - 2. The method set forth in claim 1 including establishing the convention of constructing the null and alternative hypotheses in mathematical terms, and where the convention constructs the null hypothesis statement as an equality and the alternative hypothesis statement as an inequality.
  - 3. The method set forth in claim 1 wherein determining the test includes seeking an indication as to the time ordered nature of the data set including generating a notification of process stability, seeking an indication of the nature of data as being attribute data or continuous data and, if the indication is that the data are attribute data, then seeking an indication as to the number of samples from which the data are drawn, an indication of sample size, and seeking an indication of normality of the data, wherein, if the indication is that the data are continuous, then seeking an indication as to the number of samples from which the data are drawn, an indication of sample size and an indication as to whether the data are normal, not normal, or if the distribution is unknown, wherein, if the indication is that the data are one of not normal or unknown, then providing notifications to use one of a normality test to determine normality, non-parametric tests, and data transformation functions.
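
The conducting and concluding steps of claim 1 can be illustrated with a minimal sketch. It assumes a two-sided one-sample z-test with a known population standard deviation, which is only one of the many tests the claim could cover; the function name, argument names, and result keys are hypothetical and do not come from the patent.

```python
from math import sqrt
from statistics import NormalDist, fmean


def one_sample_z_test(data, target_mean, sigma, alpha=0.05):
    """Two-sided z-test of H0: mean = target_mean, with sigma assumed known."""
    n = len(data)
    sample_mean = fmean(data)
    std_error = sigma / sqrt(n)
    std_normal = NormalDist()  # standard normal distribution (mu=0, sigma=1)

    # Test statistic and the reference (critical) value for the chosen alpha.
    z = (sample_mean - target_mean) / std_error
    z_crit = std_normal.inv_cdf(1 - alpha / 2)

    # Two-sided p-value and confidence bounds around the sample mean.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    lower = sample_mean - z_crit * std_error
    upper = sample_mean + z_crit * std_error

    # The three comparisons recited in the claim.
    checks = {
        "p_value_below_alpha": p_value < alpha,
        "statistic_beyond_reference": abs(z) > z_crit,
        "target_outside_bounds": not (lower <= target_mean <= upper),
    }
    return {
        "z": z,
        "z_crit": z_crit,
        "p_value": p_value,
        "confidence_bounds": (lower, upper),
        "checks": checks,
        "reject_null": checks["p_value_below_alpha"],
    }


# Illustrative data only: eight measurements tested against a target of 10.0.
weights = [10.2, 9.8, 10.5, 10.1, 10.4, 10.3, 9.9, 10.6]
result = one_sample_z_test(weights, target_mean=10.0, sigma=0.3)
```

For this particular test the three comparisons are mathematically equivalent, so they should agree apart from floating-point edge cases. Reporting all three gives the conclusion more than one stated basis.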

4. A non-transitory computer readable medium accessible by a computer processor including a software program for automatically applying hypothesis testing to at least one data set, the software including modules for:
- A. providing a plurality of statistical tests applicable to the at least one data set having a variety of characteristics;
  
  B. associating conventions with each test of the plurality of tests;
  
  C. determining the test including seeking indications as to at least one characteristic of the data set, generating at least one notification in response to the indications, and selecting a test from among the plurality of tests based on the indications and established conventions, and providing a notification of the nature of the selected test, the indications and established conventions;
  
  D. characterizing the data set, including establishing test criteria, selecting an appropriate reference test value depending on the test selected; and
  
  eliciting an indication of a description of the data of interest;
  
  E. constructing null and alternative hypothesis statements;
  
  F. obtaining a significance level,

  G. conducting the selected test including calculating the values of the test statistic, reference values, confidence bounds and p-values, comparing the calculated p-value to the indicated significance level and the value of the test statistic to one or more reference values, and assessing the confidence bounds in view of the null hypothesis statement;
  
  H. stating a conclusion in terms of the selected test and indicated test criteria whether to reject the null hypothesis or not to reject the null hypothesis and stating the basis for the conclusion using the results of the conducting step.
  - 5. The computer readable medium of claim 4 wherein the modules construct the null and alternative hypothesis statements in mathematical terms, and the modules establish the convention of constructing the null as an equality and the alternative hypothesis statement as an inequality.
  - 6. The computer readable medium of claim 4 wherein determining the test includes seeking an indication as to the time ordered nature of the data set and the modules generate a notification of process stability.
  - 7. The computer readable medium of claim 4 wherein determining the test includes seeking an indication of the nature of data as being attribute data or continuous data and, if the indication is that the data are attribute data, then seeking an indication as to the number of samples from which the data are drawn, seeking an indication of sample size, and seeking an indication of normality of the data, wherein, if the indication is that the data are continuous, then seeking an indication as to the number of samples from which the data are drawn, an indication of sample size and an indication as to whether the data are normal, not normal, or if the distribution is unknown, wherein, if the indication is that the data are one of not normal or unknown, then providing notifications to use one of a normality test to determine normality, non-parametric tests, and data transformation functions.
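
The indications and notifications described in claims 6 and 7 can be sketched as a small data structure plus a function that turns the indications into notifications. The field names and the notification wording are hypothetical; the claims say only that a process-stability notification is generated and that a normality test, non-parametric tests, or data transformation functions are suggested when the data are not normal or the distribution is unknown. The sketch applies that last rule to both attribute and continuous data, which is one possible reading of the claim.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Indications:
    """Indications elicited from the user (all field names are hypothetical)."""
    time_ordered: bool
    data_type: str     # "attribute" or "continuous"
    num_samples: int   # number of samples the data are drawn from
    sample_size: int
    normality: str     # "normal", "not normal", or "unknown"


def notifications_for(ind: Indications) -> List[str]:
    """Return the notifications triggered by a set of indications."""
    if ind.data_type not in ("attribute", "continuous"):
        raise ValueError("data_type must be 'attribute' or 'continuous'")
    if ind.normality not in ("normal", "not normal", "unknown"):
        raise ValueError("normality must be 'normal', 'not normal', or 'unknown'")

    notes = []
    if ind.time_ordered:
        # Claim 6: time-ordered data prompt a process-stability notification.
        notes.append("Data are time ordered: check process stability before testing.")
    if ind.normality in ("not normal", "unknown"):
        # Claim 7: suggest one of three remedies for non-normal or unknown data.
        notes.append(
            "Normality not established: use a normality test, "
            "a non-parametric test, or a data transformation."
        )
    return notes
```

Keeping the indications in one record makes it straightforward to show the user what was assumed when the selected test is later reported.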

8. A method for conducting a hypothesis test on at least one data set, the method comprising using a computer to carry out the steps of:
- A. providing a plurality of statistical tests applicable to the at least one data set, each test of the plurality of tests having a variety of characteristics and establishing conventions associated with each test of the plurality of tests;
  
  B. selecting an appropriate test based on characteristics of the data set and the tests;
  
  C. establishing test criteria, selecting an appropriate reference test value depending on the test selected, and eliciting an indication of a description of the data of interest;
  
  D. constructing null and alternative hypothesis statements in mathematical terms, including defining the null hypothesis based on the selected test and assumed conventions relating to the selected test and the indications of the test criteria and population description, providing a notification of the null hypothesis statement;
  
  E. seeking an indication of a significance level;
  
  F. conducting the selected test; and
  
  G. stating a conclusion, including calculating the values of the test statistic, calculating cut-off values, confidence intervals, and calculating p-values;
  
  comparing the calculated p-value to the indicated significance level, comparing the value of the test statistic to one or more of the reference values, the cut-off values or confidence intervals in view of the null hypothesis statement; and
  
  stating a conclusion in terms of the selected test, indicated test criteria and population descriptions whether to reject the null hypothesis or not to reject the null hypothesis based on the comparing step, and stating the basis for the conclusion using the results of the comparing step and the population descriptions.
  - 9. The method set forth in claim 8 and including defining a statement for the alternative hypothesis based on the selected test and assumed conventions relating to the selected test and indications of the test criteria, providing a notification of the alternative hypothesis statement, wherein establishing conventions includes constructing the null as an equality and the alternative hypothesis statement as an inequality.
  - 10. The method set forth in claim 8 wherein selecting the test includes seeking an indication as to the time ordered nature of the data set, and generating a notification of process stability.
  - 11. The method set forth in claim 8 wherein selecting the test includes seeking an indication of the nature of data as being attribute data or continuous data, wherein, when the data are attribute data, selecting the test includes seeking an indication as to the number of samples from which the data are drawn, seeking an indication of sample size, and seeking an indication of normality of the data, wherein, when the data is continuous, seeking an indication as to the number of samples from which the data are drawn, seeking an indication of sample size, seeking an indication as to whether the data are normal, not normal, or if normalcy is unknown, and wherein, when the data is one of not normal and unknown, selecting the test includes notifying a user to use one of a normality test to determine normality, non-parametric tests, and data transformation functions.
  - 12. The method set forth in claim 8 wherein selecting the test includes identifying a statistical parameter of interest, and selecting the parameter of interest from among the proportion, mean, and variance of the data.
  - 13. The method set forth in claim 8 wherein selecting the test includes seeking an indication of samples from which the data are drawn and seeking, based on the number of samples indicated, an indication of whether the data sample includes paired data or differences between paired data samples, and wherein, when the parameter of interest is indicated as being the mean, selecting the test includes, seeking an indication of whether variance of population is known.
  - 14. The method set forth in claim 8 wherein selecting the test includes selecting a test from the plurality of tests based on elicited indications and established conventions, and providing a notification of the nature of the selected test, the indications and established conventions.
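
Claims 12 and 13 name the inputs to test selection (parameter of interest, number of samples, paired data, known variance) but the claims do not recite which test each combination yields. The sketch below fills that gap with common textbook conventions purely as an assumption; the function name, argument names, and every returned test name are illustrative and are not taken from the patent.

```python
def select_test(parameter, num_samples, paired=False, variance_known=False):
    """Map elicited indications to a test name (mapping is an assumed convention)."""
    if parameter not in ("proportion", "mean", "variance"):
        raise ValueError("parameter must be 'proportion', 'mean', or 'variance'")
    if num_samples < 1:
        raise ValueError("num_samples must be at least 1")

    if parameter == "mean":
        if num_samples == 1:
            # Known population variance permits a z-test; otherwise use a t-test.
            return "one-sample z-test" if variance_known else "one-sample t-test"
        if num_samples == 2:
            if paired:
                return "paired t-test"
            return "two-sample z-test" if variance_known else "two-sample t-test"
        return "one-way ANOVA"

    if parameter == "proportion":
        if num_samples == 1:
            return "one-proportion z-test"
        if num_samples == 2:
            return "two-proportion z-test"
        return "chi-square test of homogeneity"

    # parameter == "variance"
    if num_samples == 1:
        return "chi-square test for one variance"
    if num_samples == 2:
        return "F-test for equal variances"
    return "Bartlett's test"
```

The paired-data question is only asked in the two-sample branch, mirroring claim 13, where the pairing indication depends on the number of samples indicated.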

15. A method for applying statistical analysis to at least one data set, the method comprising using a computer to carry out the steps of:
- A. selecting a plurality of statistical analyses applicable to the at least one data set,

  B. associating each analysis of the plurality of analyses with a variety of characteristics and established conventions;
  
  C. selecting the appropriate analysis including seeking indications regarding characteristics of the data set and the analyses among the plurality of statistical analyses and providing one or more responses based on the indications;
  
  D. conducting the selected analysis; and
  
  E. stating a conclusion in terms of the selected analysis, indicated analysis criteria and data descriptions, and stating the basis for the conclusion using the indications and data descriptions.

16. A method of automatic analysis of at least one data set by applying a hypothesis test to find differences in the at least one data set between a sample and a target, between two samples or between multiple samples, the method comprising using a computer to carry out the steps of:
- defining a plurality of statistical tests applicable to the at least one data set and establishing conventions associated with each of the plurality of statistical tests;
  
  selecting a test from among the plurality of tests including seeking an indication as to the objective of the analysis and an indication as to at least one characteristic of the data set;
  
  providing a notification of the nature of the selected test and at least one of the indications;
  
  eliciting information characterizing the data set, including eliciting an indication of a description of the data of interest;
  
  eliciting information for constructing the test;
  
  conducting the selected test; and
  
  stating a conclusion in terms of the information characterizing the data set, the conclusion including a statement reciting a basis for the conclusion.

17. A method of automatic analysis of at least one data set by applying a hypothesis test to find differences in the at least one data set between a sample and a target, between two samples or between multiple samples, the method comprising using a computer to carry out the steps of:
- defining a plurality of statistical tests applicable to the at least one data set and establishing conventions associated with each of the plurality of statistical tests;
  
  selecting a test from among the plurality of tests including seeking an indication as to the objective of the analysis and an indication as to at least one characteristic of the data set;
  
  eliciting information characterizing the data set, including eliciting an indication of a description of the data of interest;
  
  eliciting information for constructing the test, including formulating an expression of interest in terms of the information characterizing the data set;
  
  conducting the selected test including calculating at least one of summary statistics, confidence intervals, and p-values; and
  
  stating a conclusion based on the selected test and the elected expression in terms of the information characterizing the data set, the conclusion including a statement reciting a basis for the conclusion, the basis including explanatory statements incorporating at least one of summary statistics, confidence intervals and p-values.
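
The concluding step of claim 17 calls for a conclusion that recites its basis using summary statistics, confidence intervals, and p-values, phrased in the user's own description of the data. A minimal sketch of such a statement generator follows; the function name, argument names, and sentence templates are hypothetical, and the example assumes the statistic of interest is a mean.

```python
def state_conclusion(description, expression, sample_mean, ci, p_value, alpha):
    """Build a plain-language conclusion and its basis (wording is illustrative)."""
    reject = p_value < alpha
    lower, upper = ci
    confidence = round((1 - alpha) * 100)

    if reject:
        headline = (
            "Conclusion: reject the null hypothesis. "
            f"The data support the statement that {expression}."
        )
    else:
        headline = (
            "Conclusion: do not reject the null hypothesis. "
            f"The data do not provide sufficient evidence that {expression}."
        )

    # Explanatory statements that cite the p-value, a summary statistic,
    # and the confidence interval in the user's own terms.
    comparison = "less than" if reject else "not less than"
    basis = [
        f"Basis: the p-value ({p_value:.4f}) is {comparison} "
        f"the significance level ({alpha}).",
        f"The sample mean of {description} is {sample_mean:.3f}.",
        f"The {confidence}% confidence interval for the mean is "
        f"({lower:.3f}, {upper:.3f}).",
    ]
    return "\n".join([headline] + basis)
```

Passing the description and the expression of interest as plain strings keeps the statistical result separate from the wording, so the same numbers can be reported in whatever terms the user supplied.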

18. A method of automatically providing an analysis of at least one data set by applying a hypothesis test to the at least one data set, the method comprising using a computer to carry out the steps of:
- defining a plurality of statistical tests applicable to the at least one data set and establishing conventions and descriptive statistics associated with each of the plurality of statistical tests;
  
  selecting a test from among the plurality of tests including seeking an indication as to the objective of the analysis and an indication as to at least one characteristic of the data set;
  
  providing a notification of the nature of the selected test and at least one of the indications;
  
  eliciting information for characterizing the data set, including eliciting an indication of a description of the data of interest;
  
  eliciting information for constructing the test, including formulating an expression of interest in terms of the information characterizing the data set;
  
  conducting the selected test for the at least one data set according to the conventions for the selected test and the, including calculating the statistics associated with the selected statistical tests; and
  
  stating a conclusion based on the selected test and the expression of interest in terms of the information characterizing the data set, the conclusion including a statement reciting at least one descriptive statistic associated with the selected statistical test in terms of the information elicited to characterize the data set.

19. A method of automatic analysis of at least one data set by applying a hypothesis test to find differences in the at least one data set between a sample and a target, between two samples or between multiple samples, the method comprising using a computer to carry out the steps of:
- defining a plurality of statistical tests applicable to the at least one data set and establishing conventions associated with each of the plurality of statistical tests;
  
  selecting a test from among the plurality of tests including seeking an indication as to the objective of the analysis and an indication as to at least one characteristic of the data set;
  
  providing a notification of the nature of the selected test and at least one of the indications;
  
  eliciting information characterizing the data set, including eliciting an indication of a description of the data of interest;
  
  eliciting information for constructing the test, including formulating one or more alternative expressions of interest in terms of the indicated objective of the analysis and the information characterizing the data set and eliciting an election of an expression of interest;
  
  conducting the selected test; and
  
  stating a conclusion based on the selected test and the expression of interest in terms of the information characterizing the data set, the conclusion including a statement reciting a basis for the conclusion.
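
Claim 19 adds a step in which several alternative expressions of interest are formulated from the objective and the data description, and the user elects one. The sketch below shows one way this could look for two of the objectives named in the preamble (sample versus target, and two samples). The objective labels, the `Expression` type, and the sentence templates are all hypothetical.

```python
from typing import List, NamedTuple, Optional, Sequence


class Expression(NamedTuple):
    text: str         # plain-language expression offered to the user
    alternative: str  # relation used in the alternative hypothesis


def formulate_expressions(
    objective: str,
    parameter: str,
    descriptions: Sequence[str],
    target: Optional[float] = None,
) -> List[Expression]:
    """Offer candidate expressions of interest in the user's own terms."""
    if objective == "sample_vs_target":
        if target is None or len(descriptions) != 1:
            raise ValueError("need one description and a target value")
        right = str(target)
    elif objective == "two_samples":
        if len(descriptions) != 2:
            raise ValueError("need two descriptions")
        right = f"the {parameter} of {descriptions[1]}"
    else:
        raise ValueError(f"unsupported objective: {objective}")

    left = f"the {parameter} of {descriptions[0]}"
    return [
        Expression(f"{left} is different from {right}", "!="),
        Expression(f"{left} is greater than {right}", ">"),
        Expression(f"{left} is less than {right}", "<"),
    ]


def elect(expressions: List[Expression], choice: int) -> Expression:
    """Return the expression the user elected (choice is 1-based)."""
    if not 1 <= choice <= len(expressions):
        raise ValueError("choice out of range")
    return expressions[choice - 1]
```

Each candidate carries the relation for the alternative hypothesis, so the elected expression determines whether the test is two-sided or one-sided while the null hypothesis stays an equality.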

Current Assignee: MoreSteam.com LLC
Original Assignee: MoreSteam.com LLC
Inventors: Hathaway, William M.
Granted Patent: US 8,050,888 B2
US Class Current: 702/179
CPC Class Codes:
G06F 17/18 for evaluating statistical ...
G06F 3/0484 for the control of specific...
