System and method for organizing, compressing and structuring data for data mining readiness
First Claim
1. A system for performing data mining in a set of binary data arranged as a plurality of data items, wherein each data item has a plurality of bits, each bit in a corresponding one of a plurality of bit positions defined for each data item, the system comprising:
a computer system having hardware including a processor and data storage, wherein the computer system is programmed to:
arrange the set of binary data in the data storage such that the binary data is in bit position groups, wherein each bit position group corresponds to a different one of the plurality of bit positions and includes one bit from each of the plurality of data items which has that one bit position;
compress the binary data of each bit position group such that each bit position group is represented by a compressed data structure, wherein the set of binary data is represented by a plurality of compressed data structures; and
perform a data mining technique using the plurality of compressed data structures to effect a result of detecting at least one of a non-preselected pattern and a non-preselected relationship among the plurality of data items of the set of binary data.
Abstract
Systems and methods for performing data mining in a set of binary data arranged as a plurality of data items in which each data item has a plurality of bits, each bit in a corresponding one of a plurality of bit positions. The set of binary data is arranged in the data storage such that the binary data is in bit position groups. Each bit position group corresponds to a different one of the plurality of bit positions and includes bits of the binary data having that bit position. The binary data of each bit position group is compressed to produce data structures representing the set of binary data. A data mining technique is performed using the plurality of compressed data structures.
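The arrangement and compression steps described in the abstract can be illustrated with a minimal sketch. It assumes fixed-width unsigned integers as the data items and uses run-length encoding as a stand-in compression scheme; the text above does not say which compressed data structure is used, and the names `bit_position_groups` and `run_length_encode` are hypothetical.

```python
from itertools import groupby


def bit_position_groups(items, width):
    """Return one list of bits per bit position, most significant first.

    Group `pos` holds bit `pos` of every data item, in item order.
    """
    return [
        [(item >> (width - 1 - pos)) & 1 for item in items]
        for pos in range(width)
    ]


def run_length_encode(bits):
    """Compress a bit list into (bit, run_length) pairs.

    Assumption: run-length encoding is used here only for illustration.
    """
    return [(bit, len(list(run))) for bit, run in groupby(bits)]


# Four illustrative 4-bit data items.
items = [0b1100, 0b1101, 0b1110, 0b0111]

groups = bit_position_groups(items, width=4)
compressed = [run_length_encode(group) for group in groups]

for pos, runs in enumerate(compressed):
    print(f"bit position {pos}: {runs}")
```

Each entry of `compressed` represents one bit position group, so the whole data set is represented by as many compressed structures as there are bit positions.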
32 Citations
21 Claims
1. A system for performing data mining in a set of binary data arranged as a plurality of data items, wherein each data item has a plurality of bits, each bit in a corresponding one of a plurality of bit positions defined for each data item, the system comprising:
a computer system having hardware including a processor and data storage, wherein the computer system is programmed to:
arrange the set of binary data in the data storage such that the binary data is in bit position groups, wherein each bit position group corresponds to a different one of the plurality of bit positions and includes one bit from each of the plurality of data items which has that one bit position;
compress the binary data of each bit position group such that each bit position group is represented by a compressed data structure, wherein the set of binary data is represented by a plurality of compressed data structures; and
perform a data mining technique using the plurality of compressed data structures to effect a result of detecting at least one of a non-preselected pattern and a non-preselected relationship among the plurality of data items of the set of binary data.
11. A computer system for performing data mining in a set of binary data arranged as a plurality of data items, wherein each data item has a plurality of bits, each bit in a corresponding one of a plurality of bit positions defined for each data item, the computer system comprising:
means for arranging the set of binary data in a data storage device accessible by the computer system such that the binary data is in bit position groups wherein each bit position group corresponds to a different one of the plurality of bit positions and includes one bit from each of the plurality of data items which has that one bit position;
means for compressing the binary data of each bit position group such that each bit position group is represented by a compressed data structure, wherein the set of binary data is represented by a plurality of compressed data structures; and
means for performing a data mining technique using the plurality of compressed data structures to effect a result of detecting at least one of a non-preselected pattern and a non-preselected relationship among the data items of the set of binary data.
12. A computer-implemented method of performing data mining on data that is initially arranged as a plurality of data items, each data item having a plurality of bits in a plurality of bit positions, each bit being in one of the plurality of bit positions defined for each data item, the method comprising:
using a computer system, arranging the data into bit position-based groups in a data store of the computer system, wherein each bit position-based group corresponds to a different one of each of the bit positions and includes one bit from each of the plurality of data items which has that one bit position; and
using the computer system, compressing the data of each of the bit position-based groups into a compressed representation to produce a plurality of compressed representations of the data; and
using the computer system, performing data mining using the plurality of compressed representations to effect a result of detecting at least one of a non-preselected pattern and a non-preselected relationship among the data items of the set of binary data.
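As a rough illustration of the final step, the sketch below mines directly over compressed representations without decompressing them. It assumes each bit position-based group is stored as run-length-encoded (bit, run_length) pairs, which is an illustrative choice and not something the claim specifies. The helper `and_count`, the hard-coded sample data, and the `min_support` threshold are all hypothetical. The search examines every pair of bit positions rather than a pair chosen in advance, which is one simple reading of detecting a non-preselected relationship.

```python
from itertools import combinations


def and_count(runs_a, runs_b):
    """Count items whose bit is 1 in both run-length-encoded groups.

    Both inputs are lists of (bit, run_length) pairs with positive run
    lengths that cover the same number of data items.
    """
    count = 0
    i = j = 0
    left_a = runs_a[0][1] if runs_a else 0
    left_b = runs_b[0][1] if runs_b else 0
    while i < len(runs_a) and j < len(runs_b):
        step = min(left_a, left_b)  # overlap of the two current runs
        if runs_a[i][0] == 1 and runs_b[j][0] == 1:
            count += step
        left_a -= step
        left_b -= step
        if left_a == 0:  # current run of a is exhausted
            i += 1
            if i < len(runs_a):
                left_a = runs_a[i][1]
        if left_b == 0:  # current run of b is exhausted
            j += 1
            if j < len(runs_b):
                left_b = runs_b[j][1]
    return count


# Hypothetical compressed groups for four 4-bit data items
# (bit vectors 1110, 1111, 0011 and 0101, one per bit position).
compressed = [
    [(1, 3), (0, 1)],
    [(1, 4)],
    [(0, 2), (1, 2)],
    [(0, 1), (1, 1), (0, 1), (1, 1)],
]
n_items = 4
min_support = 0.75  # assumed threshold, chosen for the example

# Scan all pairs of bit positions for frequent co-occurrence of 1 bits.
frequent_pairs = [
    (a, b)
    for a, b in combinations(range(len(compressed)), 2)
    if and_count(compressed[a], compressed[b]) / n_items >= min_support
]
print(frequent_pairs)
```

Because the counting walks the runs rather than the individual bits, its cost depends on the number of runs in each group, which is where a compressed representation can pay off for highly repetitive data.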
Specification