Association rule generation and group-by processing system
First Claim
1. A group-by processing system for performing a group-by operation based on a hash method of transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record, comprising:
- record storing means for temporarily storing the records;
pointer storing means for storing pointers to the records in said record storing means at positions, each of which corresponds to a hash function value calculated using the key value of the pointed record;
output means for outputting the records pointed to by the pointers stored in said pointer storing means to said storage device, given the hash function values for storage positions of the pointers; and
group-by operation execution means for reading a list of hashed records output to said storage device by said output means, sorting the hashed records in the list according to the key value, and performing the group-by operation on the list of the sorted records.
1 Assignment
0 Petitions
Accused Products
Abstract
A group-by processing system performs a specified operation computing average value, etc. on a group of records having the same key value, efficiently accesses a secondary storage device, and realizes a high speed process. The group-by processing system includes a unit for storing records; a unit for storing pointers to the records in the unit for storing records at positions, each of which corresponds to a hash function value calculated using the key value of the pointed record; a unit for outputting the records pointed to by the pointers in the unit for storing pointers to secondary storage device, given the hash function values; and a unit for reading a list of output hashed records, sorting the records by key value, and performing a group-by operation on the sorted records. The system also includes a combination generation unit for outputting a combination of two or more items satisfying a combination generation restriction condition; an occurrence count unit for counting occurrences of the combinations; a combination selection unit for selecting the combination which satisfies the given condition of the number of occurrences; and a restriction condition generation unit for assigning the restriction condition to the combination generation unit according to the selection result. The combinations of items can be counted efficiently by performing a high-speed group-by operation.
56 Citations
55 Claims
-
1. A group-by processing system for performing a group-by operation based on a hash method of transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record, comprising:
-
record storing means for temporarily storing the records;
pointer storing means for storing pointers to the records in said record storing means at positions, each of which corresponds to a hash function value calculated using the key value of the pointed record;
output means for outputting the records pointed to by the pointers stored in said pointer storing means to said storage device, given the hash function values for storage positions of the pointers; and
group-by operation execution means for reading a list of hashed records output to said storage device by said output means, sorting the hashed records in the list according to the key value, and performing the group-by operation on the list of the sorted records. - View Dependent Claims (2, 3)
auxiliary information storing means for storing auxiliary information for retrieving records in the block according to the hash function value.
-
-
3. The system according to claim 2, wherein said group-by operation execution means performs the group-by operation on the records using a sequence of hashed records corresponding to the hash function values and output as a set of blocks by said output means and auxiliary information stored in said auxiliary information storing means.
-
4. A hash system in a group-by processing system for performing a group-by operation based on a hash method of transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record, comprising:
-
record storing means for temporarily storing the records;
pointer storing means for storing pointers to the records in said record storing means at positions, each of which corresponds to a hash function value calculated using the key value of the record; and
output means for outputting the records pointed to by the pointers stored in said pointer storing means to sequential space or non-sequential space in said storage device, given the hash function values for storage positions of the pointers. - View Dependent Claims (5)
auxiliary information storing means for storing auxiliary information for retrieving records in the block according to the hash function value.
-
-
6. A group-by processing system for performing a group-by operation, comprising:
- means for transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record; and
group-by operation execution means for performing the group-by operation using a sequence of hashed records corresponding to the hash function values stored in sequential space or non-sequential space in said storage device and auxiliary information for retrieving the records in the block according to the hash function value.
- means for transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record; and
-
7. A group-by processing system for performing a group-by operation based on a hash method of transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record, comprising:
-
record storing means for temporarily storing the records;
pointer storing means for storing pointers to the records in said record storing means at positions, each of which corresponds to a hash function value calculated using the key value of the pointed record;
output means for outputting the records pointed to by the pointers stored in said pointer storing means to sequential space or non-sequential space in said storage device as a set of blocks, each of which is output to sequential space, given the hash function values for storage positions of the pointers;
auxiliary information storing means for storing auxiliary information for retrieving records in the block according to the hash function value; and
group-by operation execution means for performing the group-by operation using a sequence of hashed records output by said output means corresponding to the hash function values and auxiliary information stored in said auxiliary information storing means.
-
-
8. A hash system in a group-by processing system for performing a group-by operation based on a hash method of transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record, comprising:
-
record storing means for temporarily storing the records;
pointer storing means for storing pointers to the records in said record storing means at positions, each of which corresponds to a hash function value calculated using the key value of the pointed record;
output means for outputting the records pointed to by the pointers stored in said pointer storing means to said storage device, given the hash function values for storage positions of the pointers; and
run information storing means for storing run information including addresses, each of which points to a record contained in each run created by said output means in said storage device, given the hash function values from a minimum value to a maximum value. - View Dependent Claims (9)
-
-
10. A group-by processing system for performing a group-by operation, comprising:
- means for transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record; and
group-by operation execution means for performing the group-by operation using a sequence of hashed records obtained as a result of the hash process in said storage device and run information, obtained as a result of the same hash process, including addresses, each of which points to a record contained in each run created by output means in said storage device, given the hash function values from a minimum value to a maximum value.
- means for transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record; and
-
11. A group-by processing system for performing a group-by operation based on a hash method of transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record, comprising:
-
record storing means for temporarily storing the records;
pointer storing means for storing pointers to the records in said record storing means at positions, each of which corresponds to a hash function value calculated using the key value of the record;
output means for outputting the records pointed to by the pointers stored in said pointer storing means to said storage device, given the hash function values for storage positions of the pointers;
run information storing means for storing run information including addresses, each of which points to a record contained in each run created by said output means in said storage device, given the hash function values from a minimum value to a maximum value; and
group-by operation execution means for performing the group-by process using a sequence of hashed records output by said output means and stored in said storage device and the run information stored in said run information storing means.
-
-
12. A group-by processing system for performing a group-by operation based on a hash method of transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record, comprising:
-
record storing means for temporarily storing the records;
pointer storing means for storing pointers to records in said record storing means at positions, each of which corresponds to a hash function value calculated using the key value of the record;
output means for outputting the records pointed to by pointers stored in said pointer storing means to said storage device, given the hash function values for storage positions of the pointers, outputting the records contained in the same run to sequential space in said storage device, given the hash function values from a minimum value to a maximum value, and outputting the records of different runs to sequential space or non-sequential space in said storage device;
run information storing means for storing run information including addresses, each of which points to a record contained in each run in said storage device; and
group-by operation execution means for performing the group-by operation using a sequence of hashed records output by said output means and stored in said storage device and the run information stored in said run information storing means.
-
-
13. A hash system in a group-by processing system for performing a group-by operation based on a hash method of transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record, comprising:
-
record storing means for temporarily storing the records;
pointer storing means for storing pointers to the records in said record storing means at positions, each of which corresponds to a hash function value calculated using the key value of the record;
output means for outputting the records pointed to by the pointers stored in said pointer storing means to sequential space or non-sequential space in said storage device as a set of blocks, and outputting the blocks with connection data between two blocks if the two blocks should be adjacent with each other and if the blocks cannot be output to sequential space in said storage device, given the hash function values for storage positions of the pointers; and
run information storing means for storing run information including addresses, each of which points to a record contained in each run created by said output means in said storage device, given the hash function values from a minimum value to a maximum value.
-
-
14. A group-by processing system for performing a group-by operation, comprising;
- means for transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record; and
group-by operation execution means for performing the group-by operation using a sequence of hashed records stored in said storage device, run information including addresses, each of which points to a record contained in each run created by output means in said storage device, given the hash function values from a minimum value to a maximum value, and connection data between two blocks which are stored in independent and non-sequential space in said storage device though the blocks should come sequentially.
- means for transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record; and
-
15. A group-by processing system for performing a group-by operation based on a hash method of transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record, comprising:
-
record storing means for temporarily storing the records;
pointer storing means for storing pointers to the records in said record storing means at positions, each of which corresponds to a hash function value calculated using the key value of the record;
output means for outputting the records pointed to by the pointers stored in said pointer storing means to sequential space or non-sequential space in said storage device, and outputting the blocks with connection data between two blocks if the two blocks should be adjacent with each other and if the blocks cannot be output to sequential space in said storage device, given the hash function values for storage positions of the pointers;
run information storing means for storing run information including addresses, each of which points to a record contained in each run created by said output means in said storage device, given the hash function values from a minimum value to a maximum value; and
group-by operation execution means for performing the group-by operation using a sequence of hashed records output by said output means and stored in said storage device, given the hash function values, the run information stored in said run information storing means, and the connection data between the two blocks.
-
-
16. A computer readable storage medium storing a process, for a group-by processing system for performing a group-by operation based on a hash method of transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record, the process comprises the steps of:
-
temporarily storing the records;
storing pointers to the temporarily stored records at positions, each of which corresponds to a hash function value calculated using the key value of the pointed record;
outputting the records pointed to by the pointers to said storage device, given the hash function values for storage positions of the pointers; and
reading a list of hashed records output to said storage device, sorting the hashed records in the list according to the key value, and performing the group-by operation on the list of the sorted records.
-
-
17. A computer readable storage medium storing a process, for a group-by processing system for performing a group-by operation based on a hash method of transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record, the process comprises the steps of:
-
temporarily storing the records;
storing pointers to the temporarily stored records at positions, each of which corresponds to a hash function value calculated using the key value of the pointed record;
outputting the records pointed to by the pointers to sequential space or non-sequential space in said storage device as a set of blocks, each of which is output to sequential space in said storage device;
storing auxiliary information for retrieving records in the block according to the hash function value; and
performing the group-by operation using a sequence of hashed records output to said storage device and the auxiliary information.
-
-
18. A computer readable storage medium storing a process, for a group-by processing system for performing a group-by operation based on a hash method of transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record, the process comprises the steps of:
-
temporarily storing the records;
storing pointers to temporarily stored records at positions, each of which corresponds to a hash function value calculated using the key value of the pointed record;
outputting the records pointed to by the pointers in said storage device, given the hash function values for storage positions of the pointers;
storing run information including addresses, each of which points to a record contained in each run created in said storage device, given the hash function values from a minimum value to a maximum value; and
performing the group-by operation using a sequence of hashed records stored in said storage device and the run information stored in said run information storing means.
-
-
19. A computer readable storage medium storing a process, for a group-by processing system for performing a group-by operation based on a hash method of transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record, the process comprises the steps of:
-
temporarily storing the records;
storing pointers to the temporarily stored records at positions, each of which corresponds to a hash function value calculated using the key value of the pointed record;
outputting the records pointed to by the pointers to said storage device, given the hash function values for storage positions of the pointers, outputting the records contained in the same run to sequential space in said storage device, given the hash function values from a minimum value to a maximum value, and outputting the records of different runs to sequential space or non-sequential space in said storage device;
storing run information including addresses, each of which points to a record contained in each run in said storage device; and
performing the group-by process using a sequence of hashed records output and stored in said storage device and the run information.
-
-
20. A computer readable storage medium storing a process, for a group-by processing system for performing a group-by operation based on a hash method of transforming records to be hashed stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record, the process comprises the steps of:
-
temporarily storing the records;
storing pointers to the temporarily stored records at positions, each of which corresponds to a hash function value calculated using the key value of the record;
outputting the records pointed to by the pointers to said storage device as a set of blocks, each of which is output to sequential space in said storage device, given the hash function values for storage positions of the pointers, and for outputting the blocks with connection data between two blocks if the two blocks should be adjacent with each other and if the blocks cannot be output to sequential space in said storage device;
storing run information including addresses, each of which points to a record contained in each run created in said storage device, given the hash function values from a minimum value to a maximum value; and
performing the group-by process using a sequence of hashed records stored in said storage device, the run information, and the connection data between the two blocks.
-
-
21. A related data combination count system for obtaining individual items or combinations of two or more items from a plurality of transactions containing one or more items, and determining the number of occurrences of the individual items or the combinations of two or more items which satisfy a given condition of the number of occurrences in the transactions, comprising:
-
combination generation means for outputting each item in each transaction when the individual item is to be obtained, and for generating every combination of the number of items to be obtained from each transaction and outputting only the combination satisfying a combination generation restriction condition that partial combinations in the combination satisfy the given condition of the number of occurrences, said partial combination comprising the number of items less than the number of items to be obtained, when the combination of two or more items is to be obtained;
occurrence count means for counting occurrences, in all transactions, of the individual item or the combination of two or more items output by said combination generation means;
combination selection means for selecting the individual item or the combination of two or more items which satisfies the given condition of the number of occurrences; and
restriction condition generation means for assigning the combination generation restriction condition to said combination generation means according to the individual item or the combination of two or more items selected by said combination selection means. - View Dependent Claims (22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40)
said combination generation means generates only the combination of items, in which partial combinations, each of which comprises the number of items less than the number of items to be obtained, correspond to one of the individual items or the combinations of items associated with the bit positions containing the specified bit value in the bit map when said combination generation means generates the combination of two or more items.
-
-
23. The system according to claim 22, wherein said restriction condition generation means sets the specified bit value at the bit position in the bit map corresponding to a hash function value for the individual item or the combination of two or more items selected by said combination selection means.
-
24. The system according to claim 21, wherein said combination generation means outputs, in addition to the items contained in the transactions, a combination of items or an individual item contained in another level in a hierarchical structure of items, excepting ancestor items in the hierarchical structure of items.
-
25. The system according to claim 21, wherein said combination generation means generates a combination of items from a sequence of items in time-series contained in the transactions with keeping the order of the sequence.
-
26. The system according to claim 21, wherein said occurrence count means performs a group-by operation to count the occurrences of the items based on a hash process, which transforms records including the individual items or the combination of two or more items stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record;
- and
said system further comprises;
record storing means for temporarily storing the records;
pointer storing means pointers to the records in said record storing means at positions, each of which corresponds to the hash function value calculated using the key value of the pointed record;
output means for outputting the records pointed to by the pointers stored in said pointer storing means to said storage device, given the hash function values for storage positions of the pointers; and
group-by operation execution means for reading a list of hashed records output to said storage device by said output means, sorting the hashed records in the list according to the key value, and performing the group-by operation on the list of the sorted records.
- and
-
27. The system according to claim 26, wherein said output means outputs the records output corresponding to the hash function values from said record storing means to said storage device as a set of blocks, and comprises:
auxiliary information storing means for storing auxiliary information for retrieving records in the block according to the hash function value.
-
28. The system according to claim 27, wherein said group-by operation execution means performs the group-by process on the records using a sequence of hashed records corresponding to the hash function values and output as a set of blocks by said output means and using auxiliary information stored in said auxiliary information storing means.
-
29. The system according to claim 21, wherein said occurrence count means performs a group-by operation to count the occurrences of the items based on a hash process, which transforms records including the individual items or the combination of two or more items stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record;
- and
said system further comprises in a hash processing system;
record storing means for temporarily storing the records;
pointer storing means for storing pointers to the records in said record storing means at positions, each of which corresponds to the hash function value calculated using the key value of the record; and
output means for outputting the records pointed to by the pointers stored in said pointer storing means to sequential space or non-sequential space in said storage device, given the hash function values for storage positions of the pointers.
- and
-
30. The system according to claim 29, wherein said output means outputs the records output corresponding to the hash function values from said record storing means to said storage device as a set of blocks, and comprises:
auxiliary information storing means for storing auxiliary information for retrieving records in the block according to the hash function value.
-
31. The system according to claim 21, wherein said occurrence count means performs a group-by operation to count the occurrences of the items based on a hash process, which transforms records including the individual items or the combination of two or more items stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record;
- and
said system further comprises;
group-by operation execution means for performing the group-by operation using a sequence of hashed records corresponding to the hash function values stored in sequential space or non-sequential space in said storage device and auxiliary information for retrieving the records in the block according to the hash function value.
- and
-
32. The system according to claim 21, wherein said occurrence count means performs a group-by operation to count the occurrences of the items based on a hash process, which transforms records including the individual items or the combination of two or more items stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record;
- and
said system further comprises;
record storing means for temporarily storing the records;
pointer storing means for storing pointers to the records in said record storing means at positions, each of which corresponds to the hash function value calculated using the key value of the record; and
output means for outputting the records pointed to by the pointers stored in said pointer storing means to sequential space or non-sequential space in said storage device as a set of blocks, each of which is output to sequential space, given the hash function values for storage positions of the pointers;
auxiliary information storing means for storing auxiliary information for retrieving records in the block according to the hash function value; and
group-by operation execution means for performing the group-by operation using a sequence of hashed records output by said output means corresponding to the hash function values and the auxiliary information stored in said auxiliary information storing means.
- and
-
33. The system according to claim 21, wherein said occurrence count means performs a group-by operation to count the occurrences of the items based on a hash process, which transforms records including the individual items or the combination of two or more items stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record;
- and
said system further comprises in a hash processing system;
record storing means for temporarily storing the records;
pointer storing means for storing pointers to the records in said record storing means at positions, each of which corresponds to the hash function value calculated using the key value of the pointed record;
output means for outputting the records pointed to by the pointers stored in said pointer storing means to said storage device, given the hash function values for storage positions of the pointers; and
run information storing means for storing run information including addresses, each of which points to a record contained in each run created by said output means in said storage device, given the hash function values from a minimum value to a maximum value.
- and
-
34. The hash system according to claim 33, wherein said output means outputs the records contained in the same run to sequential space in said storage device, and stores the records of different runs to sequential space or non-sequential space in said storage device.
-
35. The system according to claim 21, wherein said occurrence count means performs a group-by operation to count the occurrences of the items based on a hash process, which transforms records including the individual items or the combination of two or more items stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record;
- and
said system further comprises;
group-by operation execution means for performing the group-by operation using a sequence of hashed records obtained as a result of the same hash process in said storage device and run information, obtained as a result of the same hash process, including addresses, each of which points to a record contained in each run created by output means in said storage device, given the hash function values from a minimum value to a maximum value.
- and
-
36. The system according to claim 21, wherein said occurrence count means performs a group-by operation to count the occurrences of the items based on a hash process, which transforms records including the individual items or the combination of two or more items stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record;
- and
said system further comprises;
record storing means for temporarily storing the records;
pointer storing means for storing pointers to the records in said record storing means at positions, each of which corresponds to the hash function value calculated using the key value of the record;
output means for outputting the records pointed to by the pointers stored in said pointer storing means to said storage device, given the hash function values for storage positions of the pointers;
run information storing means for storing run information including addresses, each of which points to a record contained in each run created by said output means in said storage device, given the hash function values from a minimum value to a maximum value; and
group-by operation execution means for performing the group-by operation using a sequence of hashed records output by said output means and stored in said storage device and the run information stored in said run information storing means.
- and
-
37. The system according to claim 21, wherein said occurrence count means performs a group-by operation to count the occurrences of the items based on a hash process, which transforms records including the individual items or the combination of two or more items stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record;
- and
said system further comprises;
record storing means for temporarily storing the records;
pointer storing means for storing pointers to records in said record storing means at positions, each of which corresponds to the hash function value calculated using the key value of the record;
output means for outputting the records pointed to by the pointers stored in said pointer storing means to said storage device, given the hash function values for storage positions of the pointers, outputting the records contained in the same run to sequential space in said storage device, given the hash function values from a minimum value to a maximum value, and outputting the records of different runs to sequential space or non-sequential space in said storage device;
run information storing means for storing run information including addresses, each of which points to a record contained in each run in said storage device; and
group-by operation execution means for performing the group-by operation using a sequence of hashed records output by said output means and stored in said storage device and the run information stored in said run information storing means.
- and
-
38. The system according to claim 21, wherein said occurrence count means performs a group-by operation to count the occurrences of the items based on a hash process, which transforms records including the individual items or the combination of two or more items stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record;
- and
said system further comprises in a hash processing system;
record storing means for temporarily storing the records;
pointer storing means for storing pointers to the records in said record storing means at positions, each of which corresponds to the hash function value calculated using the key value of the record;
output means for outputting the records pointed to by the pointers stored in said pointer storing means to sequential space or non-sequential space in said storage device as a set of blocks, and outputting the blocks with connection data between two blocks if the two blocks should be adjacent with each other and if the blocks cannot be output to sequential space in said storage device, given the hash function values for storage positions of the pointers; and
run information storing means for storing run information including addresses, each of which points to a record contained in each run created by said output means in said storage device, given the hash function values from a minimum value to a maximum value.
- and
-
39. The system according to claim 21, wherein said occurrence count means performs a group-by operation to count the occurrences of the items based on a hash process, which transforms records including the individual items or the combination of two or more items stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record;
- and
said system further comprises;
group-by operation execution means for performing the group-by operation using a sequence of hashed records stored in said storage device, run information including addresses, each of which points to a record contained in each run created by output means in said storage device, given the hash function values from a minimum value to a maximum value, and connection data between two blocks which are stored in independent and non-sequential space in said storage device though the blocks should come sequentially.
- and
-
40. The system according to claim 21, wherein said occurrence count means performs a group-by operation to count the occurrences of the items based on a hash process, which transforms records including the individual items or the combination of two or more items stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record;
- and
said system further comprises;
record storing means for temporarily storing records;
pointer storing means for storing pointers to the records in said record storing means at positions, each of which corresponds to the hash function value calculated using the key value of the record;
output means for outputting the records pointed to by the pointers stored in said pointer storing means to sequential space or non-sequential space in said storage device, and outputting the blocks with connection data between two blocks if the two blocks should be adjacent with each other and if the blocks cannot be output to sequential space in said storage device, given the hash function values for storage positions of the pointers;
run information storing means for storing run information including addresses, each of which points to a record contained in each run created by said output means in said storage device, given the hash function values from a minimum value to a maximum value; and
group-by operation execution means for performing the group-by operation using a sequence of hashed records output by said output means and stored in said storage device, given the hash function values, the run information stored in said run information storing means, and the connection data between the two blocks.
- and
-
41. A computer readable storage medium storing a process, in a related data combination count system for obtaining individual items or combinations of two or more items from a plurality of transactions containing one or more items, and determining the number of occurrences of the individual item or the combination of two or more items which satisfy a given condition of the number of occurrences in the transactions, the process comprises the steps of:
-
outputting each item in each transaction when the individual item is to be obtained, and generating every combination of the number of items to be obtained from each transaction and outputting only the combination satisfying a combination generation restriction condition that partial combinations in the combination satisfy the given condition of the number of occurrences, said partial combination comprising the number of items less than the number of items to be obtained, when the combination of two or more items is to be obtained;
counting occurrences, in all transactions, of the individual item or the combination of two or more items output by said outputting function;
selecting the individual item or the combination of two or more items which satisfies the given condition of the number of occurrences; and
assigning the combination generation restriction condition to said outputting function according to the individual item or the combination of two or more items selected by said selecting function. - View Dependent Claims (42, 43, 44, 45, 46)
said process further comprises the steps of;
temporarily storing the records;
storing pointers to the temporarily stored records at positions, each of which corresponds to the hash function value calculated using the key value of the pointed record;
outputting the records pointed to by the pointers to said storage device, given the hash function values for storage positions of the pointers; and
reading a list of hashed records output to said storage device, sorting the hashed records in the list according to the key value, and performing the group-by operation on the list of the sorted records.
-
-
43. The storage medium according to claim 41, wherein said step of counting occurrences performs a group-by operation to count the occurrences of the items based on a hash process, which transforms records including the individual items or the combination of two or more items stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record;
- and
said process further comprises the steps of;
temporarily storing the records;
storing pointers to the temporarily stored records at positions, each of which corresponds to the hash function value calculated using the key value of the pointed record; and
outputting the records pointed to by the pointers to sequential space or non-sequential space in said storage device as a set of blocks, each of which is output to sequential space in said storage device;
storing auxiliary information for retrieving records in the block according to the hash function value; and
performing the group-by operation using a sequence of hashed records output to said storage device and the auxiliary information.
- and
-
44. The storage medium according to claim 41, wherein said step of counting occurrences performs a group-by operation to count the occurrences of the items based on a hash process, which transforms records including the individual items or the combination of two or more items stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record;
- and
said process further comprises the steps of;
temporarily storing the records;
storing pointers to the temporarily stored records at positions, each of which corresponds to the hash function value calculated using the key value of the pointed record;
outputting the records pointed to by the pointers in said storage device, given the hash function values for storage positions of the pointers;
storing run information including addresses, each of which points to a record contained in each run created in said storage device, given the hash function values from a minimum value to a maximum value; and
performing the group-by operation using a sequence of hashed records stored in said storage device and the run information stored in said run information storing means.
- and
-
45. The storage medium according to claim 41, wherein said step of counting occurrences performs a group-by operation to count the occurrences of the items based on a hash process, which transforms records including the individual items or the combination of two or more items stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record;
- and
said process further comprises the step of;
temporarily storing the records;
storing pointers to the temporarily stored records at positions, each of which corresponds to the hash function value calculated using the key value of the record;
outputting the records pointed to by the pointers to said storage device, given the hash function values for storage positions of the pointers, outputting the records contained in the same run to sequential space in said storage device, given the hash function values from a minimum value to a maximum value, and outputting the records of different runs to sequential space or non-sequential space in said storage device;
storing run information including addresses, each of which points to a record contained in each run in said storage device; and
performing the group-by operation using a sequence of hashed records output and stored in said storage device and the run information.
- and
-
46. The storage medium according to claim 41, wherein said step of counting occurrences performs a group-by operation to count the occurrences of the items based on a hash process, which transforms records including the individual items or the combination of two or more items stored in a storage device into a referenceable storage form using hash function values, each of which corresponds to a key value of the record;
- and
said process further comprises the steps of;
temporarily storing records;
storing pointers to the temporarily stored records at positions, each of which corresponds to the hash function value calculated using the key value of the record;
outputting the records pointed to by the pointers to said storage device as a set of blocks, each of which is output to sequential space in said storage device, given the hash function values for storage positions of the pointers, and for outputting the blocks with connection data between two blocks if the two blocks should be adjacent with each other and if the blocks cannot be output to sequential space in said storage device;
storing run information including addresses, each of which points to a record contained in each run created in said storage device, given the hash function values from a minimum value to a maximum value; and
performing the group-by process using a sequence of hashed records stored in said storage device, the run information, and the connection data between the two blocks.
- and
-
47. A method of counting related data combinations for obtaining individual items or combinations of two or more items from a plurality of transactions containing one or more items, and determining the number of occurrences of the individual items or the combinations of two or more items which satisfy a given condition of the number of occurrences in the transactions, comprising the steps of:
-
outputting each item in each transaction when the individual item is to be obtained, and generating every combination of the number of items to be obtained from each transaction and outputting only the combination satisfying a combination generation restriction condition that partial combinations in the combination satisfy the given condition of the number of occurrences, said partial combination comprising the number of items less than the number of items to be obtained, when the combination of two or more items is to be obtained;
counting occurrences, in all transactions, of the individual item or the combination of two or more items output by said outputting step;
selecting the individual item or the combination of two or more items which satisfies the given condition of the number of occurrences; and
assigning the combination generation restriction condition to said outputting step according to the individual item or the combination of two or more items selected by said selecting step. - View Dependent Claims (48, 49, 50, 51)
creating, for each number of items to be obtained, a bit map with specified bit value at positions corresponding to the individual items or the combinations of two or more items selected by said selecting step, and said outputting step further comprises the step of;
using at least one bit map for the number of items less than the number of items to be obtained, and determining that the partial combination in the combination corresponds to the individual items or combinations of two or more items associated with the positions with specified bit value in the bit map.
-
-
49. The method according to claim 48, wherein each position in the bit map corresponds to a hash function value, calculated using identification numbers for individual items or for the combinations of two or more items.
-
50. The method according to claim 47, wherein, in addition to the items contained in the transactions, individual items or combinations of two or more items contained in another level in a hierarchical structure of items, excepting ancestor items in the hierarchical structure of items, and the number of occurrences of the individual items or the combinations of two or more items are obtained.
-
51. The method according to claim 47, wherein individual items or combinations of two or more items satisfying the given condition and the number of occurrences of the individual items or the combinations of two or more items are obtained from a sequence of items in time-series contained in the transactions with keeping the original order of the sequence.
-
52. A method of counting related data combinations for obtaining individual items or combinations of two or more items from a plurality of transactions containing one or more items, and determining the number of occurrences of the individual items or the combinations of two or more items which satisfy a given condition of the number of occurrences in the transactions, comprising the steps of:
-
(a) counting individual items in each transaction, and obtaining the number of occurrences of each individual item in all transactions;
(b) selecting individual items having the number of occurrences which meets the given condition, and outputting the individual items and the number of occurrences;
(c) generating a bit map for the individual items with specified bit value set at bit positions, each of which corresponds to one of the output items;
(d) generating combinations of two items, from transactions of two or more items, including only items corresponding to the bit position having the specified bit value in the bit map;
(e) counting the combinations of two items in the transactions of two or more items, and obtaining the number of occurrences of each combination of two items;
(f) selecting the combinations of two items having the number of occurrences which meets the given condition, and outputting the selected combinations of two items and the number of occurrences;
(g) generating a bit map for the combinations of two items with specified bit value set at bit positions, each of which corresponds to one of the selected combination of two items, and updating the bit map for the individual items with specified bit value set at bit positions, each of which corresponds to one of the two items of the selected combination of two items;
(h) generating combinations of three items, from transactions of three or more items, including only one item or combination of two items corresponding to the bit positions having the specified bit value in the previous generated bit maps;
(i) counting the combinations of three items in the transactions of three or more items, and obtaining the number of occurrences of each combination of three items;
(j) selecting the combinations of three items having the number of occurrences which meets the given condition, and outputting the selected combinations of three items and the number of occurrences;
(k) generating a bit map for the combinations of three items with specified bit value set at bit positions, each of which corresponds to one of the selected combination of three items, updating the bit map for the combinations of two items with specified bit value set at bit positions, each of which corresponds to one of the three combinations of two items of the selected combination of three items, and updating the bit map for the individual items with specified bit value set at bit positions, each of which corresponds to one of the three items of the selected combination of three items; and
(l) repeating steps (h) through (k) for combinations of four or more items in the same manner. - View Dependent Claims (53, 54, 55)
-
Specification