LARGE-SCALE ITEM AFFINITY DETERMINATION USING A MAP REDUCE PLATFORM
Abstract
Pair-wise item affinity is determined based on transaction records. Each transaction record includes an indication of a bucket and an indication of an item transacted corresponding to that bucket. The method comprises four phases: Phase 1, bucket filtering; Phase 2, item count; Phase 3, bucket materialization; and Phase 4, pair count and affinity/lift calculation. The phases are well suited to being carried out by a computing system in a map-reduce configuration.
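The four phases can be sketched in a single process as follows. This is an illustrative sketch only, not the patented implementation: it runs in memory rather than on a map-reduce platform, and several details are assumptions that the abstract does not specify. Those assumptions are: the Phase 1 filter drops buckets with fewer than two distinct items, items are counted once per bucket, item codes are integers assigned by descending frequency, and lift uses the common definition (observed pair frequency divided by the product of the two item frequencies). The function name `item_affinity` and the parameter `min_items` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def item_affinity(records, min_items=2):
    """records: iterable of (bucket_id, item_id) transaction records."""
    # Group the distinct items transacted in each bucket.
    buckets = defaultdict(set)
    for bucket_id, item_id in records:
        buckets[bucket_id].add(item_id)

    # Phase 1 (assumed filter): drop buckets too small to yield any pair.
    buckets = {b: items for b, items in buckets.items() if len(items) >= min_items}

    # Phase 2: count appearances of each item across all buckets, then encode
    # items as integers (assumed scheme: most frequent item gets code 0).
    item_counts = Counter(item for items in buckets.values() for item in items)
    ranked = sorted(item_counts, key=lambda it: (-item_counts[it], str(it)))
    code_of = {item: code for code, item in enumerate(ranked)}

    # Phase 3: one record per bucket holding all of its item codes, plus the
    # number of pairs that bucket can generate, n * (n - 1) / 2.
    materialized = {b: sorted(code_of[i] for i in items) for b, items in buckets.items()}
    pairs_per_bucket = {b: len(c) * (len(c) - 1) // 2 for b, c in materialized.items()}

    # Phase 4: count co-occurring pairs of item codes and compute lift.
    pair_counts = Counter()
    for codes in materialized.values():
        pair_counts.update(combinations(codes, 2))

    n = len(buckets)
    count_of_code = {code_of[i]: c for i, c in item_counts.items()}
    lift = {
        (a, b): (c / n) / ((count_of_code[a] / n) * (count_of_code[b] / n))
        for (a, b), c in pair_counts.items()
    }
    return code_of, pairs_per_bucket, pair_counts, lift
```

In a map-reduce setting, each phase would typically become one or more map and reduce passes keyed by item or by bucket; the sketch keeps everything in dictionaries only to make the data flow between phases visible.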
24 Claims
1. A computer-implemented method of determining pair-wise item affinity based on transaction records tangibly embodied in at least one computer-readable medium, each transaction record including an indication of a bucket and an indication of an item transacted corresponding to that bucket, the method comprising:
executing computer code by at least one computing device of a computing system to determine, for each partition, a total number of potential item pairs for that partition and a total count of unique items for that partition;
executing computer code by at least one computing device of the computing system to perform an item count, comprising:
determining, for each item, a count of the number of appearances of each item in all the buckets collectively;
for each item, encoding that item based at least in part on the determined item distribution across partitions;
executing computer code by at least one computing device of the computing system to perform a bucket materialization, comprising:
for each bucket, collecting into one record all item codes for items transacted in correspondence with that bucket;
for each bucket, processing the one record for that bucket to determine a number of item pairs that can be generated for that bucket and encoding that bucket based at least in part on the determined pair distribution across partitions;
executing computer code by at least one computing device of the computing system to perform a pair count and affinity/lift calculation, comprising:
generating pairs of item codes, and generating affinity statistics based on generated pairs of item codes; and
causing the generated pairs of item codes and affinity statistics to be stored in a tangible computer-readable medium.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
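As a worked reading of the bucket materialization and affinity/lift steps, the quantities involved can be written out as below. The pair count follows directly from counting unordered pairs. The lift formula is the commonly used definition and is an assumption here, since the claim language does not define it. The symbols are illustrative: n_b is the number of distinct items in bucket b, N is the total number of buckets, c(a) is the number of buckets containing item a, and c(a, b) is the number of buckets containing both a and b.

```latex
\[
  P_b = \binom{n_b}{2} = \frac{n_b\,(n_b - 1)}{2},
  \qquad
  \mathrm{lift}(a, b)
    = \frac{c(a, b)/N}{\bigl(c(a)/N\bigr)\,\bigl(c(b)/N\bigr)}
    = \frac{N\, c(a, b)}{c(a)\, c(b)}
\]
```

For example, a bucket with 4 distinct items can generate 4 × 3 / 2 = 6 pairs. Because the pair count grows quadratically with bucket size, a few very large buckets can dominate the work, which is one reason to track pair totals per partition.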
9. A computing system configured to determine pair-wise item affinity based on transaction records tangibly embodied in at least one computer-readable medium, each transaction record including an indication of a bucket and an indication of an item transacted corresponding to that bucket, the computing system configured to:
execute computer code by at least one computing device of the computing system to determine, for each partition, a total number of potential item pairs for that partition and a total count of unique items for that partition;
execute computer code by at least one computing device of the computing system to perform an item count, comprising:
determining, for each item, a count of the number of appearances of each item in all the buckets collectively;
for each item, encoding that item based at least in part on the determined item distribution across partitions;
execute computer code by at least one computing device of the computing system to perform a bucket materialization, comprising:
for each bucket, collecting into one record all item codes for items transacted in correspondence with that bucket;
for each bucket, processing the one record for that bucket to determine a number of item pairs that can be generated for that bucket and encoding that bucket based at least in part on the determined pair distribution across partitions; and
execute computer code by at least one computing device of the computing system to perform a pair count and affinity/lift calculation, comprising:
generating pairs of item codes, and generating affinity statistics based on generated pairs of item codes; and
causing the generated pairs of item codes and affinity statistics to be stored in a tangible computer-readable medium.
Dependent claims: 10, 11, 12, 13, 14, 15, 16.
17. A computer-program product comprising at least one computer-readable medium having computer-executable code tangibly embodied thereon, the computer-executable code to configure at least one computing device to:
determine, for each partition, a total number of potential item pairs for that partition and a total count of unique items for that partition;
perform an item count, comprising:
determining, for each item, a count of the number of appearances of each item in all the buckets collectively;
for each item, encoding that item based at least in part on the determined item distribution across partitions;
perform a bucket materialization, comprising:
for each bucket, collecting into one record all item codes for items transacted in correspondence with that bucket;
for each bucket, processing the one record for that bucket to determine a number of item pairs that can be generated for that bucket and encoding that bucket based at least in part on the determined pair distribution across partitions; and
perform a pair count and affinity/lift calculation, comprising:
generating pairs of item codes, and generating affinity statistics based on generated pairs of item codes; and
causing the generated pairs of item codes and affinity statistics to be stored in a tangible computer-readable medium.
Dependent claims: 18, 19, 20, 21, 22, 23, 24.
Specification