Method and apparatus for generating test data sets in accordance with user feedback
Abstract
Techniques are provided for processing data sets and, more particularly, for constructing a synthetic data set (test data set) from real data sets (input data sets) in accordance with user feedback. The technique mimics the real data sets to generate corresponding synthetic ones. Multiple real data sets may be used to create a test data set that combines the characteristics of those data sets. Users of the technique can modify the characteristics of the data sets to create a new data set with features that the user desires. For example, a user may change the shape or size of the different patterns in the data, or distort them, to create a new data set. A user may also choose to inject noise into the system.
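The idea in the abstract can be illustrated with a minimal sketch. It is not the claimed method, only one simple reading of it under stated assumptions: the input data are 2-D points whose cluster labels are already known, the user's shape change is a per-axis stretch about each cluster centroid, and noise is injected as uniformly distributed points. The function name `synthesize` and the parameters `stretch` and `noise_fraction` are hypothetical and do not appear in the patent.

```python
import random


def synthesize(points, labels, stretch=(1.0, 1.0), noise_fraction=0.0, seed=0):
    """Build a synthetic 2-D data set from labelled input points.

    stretch        -- hypothetical user input: per-axis scale applied to each
                      cluster about its centroid, which changes its shape.
    noise_fraction -- hypothetical user input: number of uniform noise points
                      to add, as a fraction of the input size.
    Returns a list of (x, y, label) tuples; noise points carry label None.
    """
    rng = random.Random(seed)

    # Group the input points by their (assumed known) cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    synthetic = []
    for label, members in clusters.items():
        cx = sum(x for x, _ in members) / len(members)
        cy = sum(y for _, y in members) / len(members)
        for _ in members:
            # Mimic the input: resample a member, then reshape its offset
            # from the centroid according to the user's stretch factors.
            x, y = rng.choice(members)
            synthetic.append((cx + stretch[0] * (x - cx),
                              cy + stretch[1] * (y - cy),
                              label))

    # Inject noise uniformly over the bounding box of the input data.
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    for _ in range(int(noise_fraction * len(points))):
        synthetic.append((rng.uniform(min(xs), max(xs)),
                          rng.uniform(min(ys), max(ys)),
                          None))
    return synthetic
```

Calling `synthesize(points, labels, stretch=(2.0, 1.0), noise_fraction=0.25)` would double the width of every cluster while keeping its height, and add noise points equal to a quarter of the input size. A data mining technique such as a clustering algorithm could then be run on the result and scored against the known labels.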
23 Claims
1. A method of generating at least one test data set from at least one input data set, the test data set being capable of determining quality of results from a data mining technique performed in a computer, the method comprising the steps of:
obtaining the at least one input data set; and
constructing the at least one test data set from the at least one input data set based on characteristics associated with the at least one input data set and user input resulting in at least a changing of a shape of one or more clustered regions in data of the at least one input data set.
Dependent claims: 2-11.
12. Apparatus for generating at least one test data set from at least one input data set, the test data set being capable of determining quality of results from a data mining technique performed in a computer, the apparatus comprising:
a memory; and
at least one processor coupled to the memory and operative to:
(i) obtain at least one input data set; and
(ii) construct at least one test data set from the at least one input data set based on characteristics associated with the at least one input data set and user input resulting in at least a changing of a shape of one or more clustered regions in data of the at least one input data set.
Dependent claims: 13-22.
23. An article of manufacture for generating at least one test data set from at least one input data set, the test data set being capable of determining quality of results from a data mining technique performed in a computer, the article of manufacture comprising a machine readable medium containing one or more programs which when executed implement the steps of:
obtaining the at least one input data set; and
constructing the at least one test data set from the at least one input data set based on characteristics associated with the at least one input data set and user input resulting in at least a changing of a shape of one or more clustered regions in data of the at least one input data set.
Specification