System and method for organizing, compressing and structuring data for data mining readiness
First Claim
1. A method of structuring data in a data-mining-ready format, wherein said data has been previously organized in a bit-Sequential (bSQ) format that comprises a plurality of binary files identified by a bit position, said method comprising the steps of:
dividing each of said plurality of binary files into first quadrants;
recording the count of 1-bits for each first quadrant on a first level;
dividing each of said first quadrants into new quadrants;
recording the count of 1-bits for each of said new quadrants on a new level; and
repeating the two steps immediately above until all of said new quadrants comprise a pure-1 quadrant or a pure-0 quadrant to form a basic tree structure.
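The recursive quadrant counting recited in the claim can be illustrated with a minimal sketch. It assumes each binary file has already been loaded as a square 2-D grid of 0/1 values whose side length is a power of two. The names `CountNode` and `build_count_tree`, the raster ordering of the quadrants (upper-left, upper-right, lower-left, lower-right), and the root node holding the count for the whole file are illustrative assumptions, not details taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CountNode:
    """One quadrant: its 1-bit count, side length, and sub-quadrants (hypothetical type)."""
    count: int
    size: int
    children: List["CountNode"] = field(default_factory=list)


def build_count_tree(bits, row=0, col=0, size=None):
    """Record 1-bit counts per quadrant, subdividing until every quadrant is pure.

    Assumes `bits` is a square grid of 0/1 values with a power-of-two side length.
    """
    if size is None:
        size = len(bits)
    count = sum(bits[r][c]
                for r in range(row, row + size)
                for c in range(col, col + size))
    node = CountNode(count=count, size=size)
    # A pure-0 or pure-1 quadrant is a leaf; a 1x1 quadrant is always pure.
    if count == 0 or count == size * size:
        return node
    half = size // 2
    # Assumed quadrant order: upper-left, upper-right, lower-left, lower-right.
    for dr, dc in ((0, 0), (0, half), (half, 0), (half, half)):
        node.children.append(build_count_tree(bits, row + dr, col + dc, half))
    return node


# Illustrative 4x4 bit plane (made-up data).
example_plane = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
tree = build_count_tree(example_plane)
first_level_counts = [child.count for child in tree.children]
```

In this example only the upper-right first quadrant is mixed, so it is the only one that is divided again; the other three are pure and stay as leaves.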
Abstract
A system and method take data in the form of an n-dimensional array of binary data, where the binary data comprises bits that are identified by a bit position within the n-dimensional array, create one file for each bit position of the binary data while maintaining the bit position identification, and store the bit with the corresponding bit position identification from the binary data within the created file. Once this bit-sequential format of the data is achieved, the formatted data is structured into a tree format that is data-mining-ready. The formatted data is structured by dividing each of the files containing the binary data into quadrants according to the bit position identification and recording the count of 1-bits for each quadrant on a first level. Each of the quadrants is then recursively divided into further quadrants, and the count of 1-bits is recorded for each quadrant, until all quadrants comprise a pure-1 quadrant or a pure-0 quadrant, forming a basic tree structure.
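The bit-sequential step described in the abstract can be sketched in memory as follows. The sketch assumes a 2-D array of non-negative integers with a fixed bit width and places the most significant bit first. The function name `to_bit_planes`, the bit ordering, and the restriction to two dimensions are assumptions made for illustration only.

```python
def to_bit_planes(values, bits_per_value=8):
    """Split a 2-D array of non-negative ints into one 0/1 plane per bit position.

    planes[0] holds the most significant bit (an assumed ordering).
    """
    planes = []
    for pos in range(bits_per_value):
        shift = bits_per_value - 1 - pos
        planes.append([[(v >> shift) & 1 for v in row] for row in values])
    return planes


# Illustrative 2x2 array of 3-bit values: 5 = 101, 2 = 010, 7 = 111, 0 = 000.
sample = [[5, 2], [7, 0]]
planes = to_bit_planes(sample, bits_per_value=3)
```

Each plane keeps the original cell layout, so a bit's position in the array stays identifiable after the split.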
36 Claims
1. A method of structuring data in a data-mining-ready format, wherein said data has been previously organized in a bit-Sequential (bSQ) format that comprises a plurality of binary files identified by a bit position, said method comprising the steps of:
dividing each of said plurality of binary files into first quadrants;
recording the count of 1-bits for each first quadrant on a first level;
dividing each of said first quadrants into new quadrants;
recording the count of 1-bits for each of said new quadrants on a new level; and
repeating the two steps immediately above until all of said new quadrants comprise a pure-1 quadrant or a pure-0 quadrant to form a basic tree structure.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 26, 27, 28, 29, 30, 36
13. A system for structuring data in a data-mining-ready format, wherein said data has been previously organized in a bit-Sequential (bSQ) format that comprises a plurality of binary files identified by a bit position, said system comprising:
a computer system and a set of computer readable instructions, wherein said set of instructions include directing said computer system to:
divide each of said plurality of binary files into first quadrants;
record the count of 1-bits for each first quadrant on a first level;
divide each of said first quadrants into new quadrants;
record the count of 1-bits for each of said new quadrants on a new level; and
repeat recursively until all of said new quadrants comprise a pure-1 or pure-0 quadrant to form a basic tree structure.
25. A system for formatting data, wherein said data is in the form of an n-dimensional array of binary data, said binary data comprising a plurality of bits that are identified by a bit position within the n-dimensional array, the system comprising:
a computer system and a set of computer readable instructions, wherein said set of instructions include directing said computer system to:
create one file for each bit position of said binary data wherein the bit position identification is maintained; and
store the bit with the corresponding bit position identification from said binary data within the created file.
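A minimal sketch of the file-creation step might look like the following. It assumes a 2-D array of non-negative integers with a fixed bit width, and it keeps the bit position identification in the file name. The function name `write_bsq_files`, the `band_bit{pos}.bsq` naming scheme, the most-significant-bit-first ordering, and the one-ASCII-character-per-bit encoding are all hypothetical choices made for readability; none of them comes from the patent.

```python
import os
import tempfile


def write_bsq_files(values, bits_per_value, out_dir, prefix="band"):
    """Write one file per bit position; each file holds that bit for every cell.

    Cells are written in raster order as ASCII '0'/'1' characters
    (an assumed, unpacked encoding chosen for readability).
    """
    flat = [v for row in values for v in row]
    paths = []
    for pos in range(bits_per_value):
        shift = bits_per_value - 1 - pos  # position 0 is the most significant bit
        path = os.path.join(out_dir, f"{prefix}_bit{pos}.bsq")
        with open(path, "w", encoding="ascii") as f:
            f.write("".join(str((v >> shift) & 1) for v in flat))
        paths.append(path)
    return paths


# Usage: write the three bit files for a 2x2 array of 3-bit values.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_paths = write_bsq_files([[5, 2], [7, 0]], 3, demo_dir)
    demo_contents = []
    for p in demo_paths:
        with open(p, encoding="ascii") as f:
            demo_contents.append(f.read())
```

A production format would more likely pack eight bits per byte and store the array dimensions alongside the data; the text encoding here only keeps the example easy to inspect.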
31. A method of formatting data, wherein said data is in the form of an n-dimensional array of binary data, said binary data comprising a plurality of bits that are identified by a bit position within the n-dimensional array, said method comprising the steps of:
creating one file for each bit position of said binary data while maintaining the bit position identification; and
storing the bit with the corresponding bit position identification from said binary data within the created file.
Dependent claims: 32, 33, 34, 35
Specification