Scaling machine learning using approximate counting
First Claim
1. A method performed by one or more of a plurality of computer devices, the method comprising:
identifying, by a first computer device of the plurality of computer devices, a feature set that includes at least a first feature and a second feature;
storing, by the first computer device and in a plurality of memory locations in a memory, values relating to the first feature;
subjecting, by the first computer device, a string, associated with the first feature, to multiple, different hash functions to generate multiple, different hash values;
identifying, by the first computer device, for each of the multiple, different hash values, a respective memory location, of the plurality of memory locations in the memory;
reading, by the first computer device, the values stored at the respective memory locations;
performing, by the first computer device, an operation on the read values from the respective memory locations to obtain updated values;
writing, by the first computer device, the updated values into the respective memory locations;
sending, by the first computer device, a request to a second computer device, of the plurality of computer devices, for information regarding the second feature;
receiving, by the first computer device, the information from the second computer device; and
using, by the first computer device, the updated values and the received information to make a prediction regarding particular data.
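The hash, read, operate, and write steps recited above can be sketched as follows. This is a minimal illustration, not the patented implementation: the claim leaves the hash functions and the "operation" open, so the seeded SHA-256 hashes, the table size, the increment operation, and the count-min style read-out (`estimate`) are all assumptions, as are the function names.

```python
import hashlib

NUM_HASHES = 4           # assumption: number of different hash functions
NUM_LOCATIONS = 1 << 16  # assumption: number of memory locations


def memory_locations(feature: str) -> list[int]:
    """Return one memory location per hash function for a feature string."""
    locations = []
    for seed in range(NUM_HASHES):
        # Prefixing a seed to the string stands in for "different hash functions".
        digest = hashlib.sha256(f"{seed}:{feature}".encode("utf-8")).digest()
        locations.append(int.from_bytes(digest[:8], "big") % NUM_LOCATIONS)
    return locations


def update(memory: list[int], feature: str, delta: int = 1) -> None:
    """Read each location, apply the operation (here: add delta), write back."""
    for loc in memory_locations(feature):
        value = memory[loc]          # read the stored value
        memory[loc] = value + delta  # operate on it and write the updated value


def estimate(memory: list[int], feature: str) -> int:
    """Count-min style read-out: the smallest counter is least inflated by collisions."""
    return min(memory[loc] for loc in memory_locations(feature))


memory = [0] * NUM_LOCATIONS
for _ in range(3):
    update(memory, "query:shoes")
print(estimate(memory, "query:shoes"))
```

With non-negative increments, a read-out of this kind can overcount when features collide but never undercounts, which is the usual trade-off accepted for a fixed memory footprint.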
Abstract
A system may track statistics for a number of features using an approximate counting technique by: subjecting each feature to multiple, different hash functions to generate multiple, different hash values, where each of the hash values may identify a particular location in a memory, and storing statistics for each feature at the particular locations identified by the hash values. The system may generate rules for a model based on the tracked statistics.
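As a rough illustration of generating rules from tracked statistics, the sketch below keeps two approximate counters per feature (occurrences and positive outcomes) and emits a rule for every candidate feature with enough support. The abstract does not say which statistics are tracked or what form a rule takes, so the `ApproxStats` class, the `generate_rules` function, the `min_support` threshold, and the rule fields are all hypothetical.

```python
import hashlib

SIZE = 1 << 12  # assumption: number of memory locations
HASHES = 3      # assumption: number of different hash functions


def slots(feature: str) -> list[int]:
    """Locations identified by the different hash values of a feature."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{feature}".encode()).digest()[:8], "big") % SIZE
        for i in range(HASHES)
    ]


class ApproxStats:
    """Approximate per-feature statistics stored at hashed locations."""

    def __init__(self) -> None:
        self.seen = [0] * SIZE      # how often a feature occurred
        self.positive = [0] * SIZE  # how often it occurred with a positive outcome

    def record(self, feature: str, label: bool) -> None:
        for s in slots(feature):
            self.seen[s] += 1
            self.positive[s] += int(label)

    def estimate(self, feature: str) -> tuple[int, int]:
        idx = slots(feature)
        seen = min(self.seen[s] for s in idx)
        positive = min(self.positive[s] for s in idx)
        return seen, min(positive, seen)  # keep the pair consistent


def generate_rules(stats: ApproxStats, candidates: list[str], min_support: int = 5) -> list[dict]:
    """Emit one hypothetical rule per feature whose estimated support is high enough."""
    rules = []
    for feature in candidates:
        seen, positive = stats.estimate(feature)
        if seen >= min_support:
            rules.append({"feature": feature, "support": seen, "positive_rate": positive / seen})
    return rules


stats = ApproxStats()
for label in [True] * 8 + [False] * 2:
    stats.record("word:free", label)
print(generate_rules(stats, ["word:free"]))
```

Because the counters are approximate, a threshold such as `min_support` operates on estimates rather than exact counts; a real system would need to choose the table size with that in mind.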
20 Claims
1. A method performed by one or more of a plurality of computer devices (independent claim, recited in full under First Claim above). Dependent claims: 2, 3, 4, 5, 6, 7.
8. A system comprising:
a computer device to:
identify a feature set that includes at least a first feature and a second feature;
store, in a plurality of memory locations in a memory, values relating to the first feature;
subject a string, associated with the first feature, to multiple, different hash functions to generate multiple, different hash values;
identify, for each of the multiple, different hash values, a respective memory location, of the plurality of memory locations in the memory;
read the values stored at the respective memory locations;
perform an operation on the read values from the respective memory locations to obtain updated values;
write the updated values into the respective memory locations;
send a request to a second computer device for information regarding the second feature;
receive the information from the second computer device, and
use the updated values and the received information to make a prediction regarding particular data.
Dependent claims: 9, 10, 11, 12, 13, 14.
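The division of work between the two devices in this claim might look like the sketch below. It is illustrative only: the claim does not specify the transport, the form of the "information", or the prediction function, so the in-process method call standing in for a network request, the smoothed log-odds returned by the second device, the logistic combination, and all class and method names are assumptions.

```python
import hashlib
import math


def slots(feature: str, size: int, hashes: int = 3) -> list[int]:
    """One memory location per (seeded) hash function."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{feature}".encode()).digest()[:8], "big") % size
        for i in range(hashes)
    ]


class StatsShard:
    """Approximate (shows, clicks) counters held by one device."""

    def __init__(self, size: int = 4096) -> None:
        self.size = size
        self.shows = [0] * size
        self.clicks = [0] * size

    def update(self, feature: str, clicked: bool) -> None:
        for s in slots(feature, self.size):
            self.shows[s] += 1             # read, increment, write
            self.clicks[s] += int(clicked)

    def log_odds(self, feature: str) -> float:
        idx = slots(feature, self.size)
        shows = min(self.shows[s] for s in idx)
        clicks = min(min(self.clicks[s] for s in idx), shows)
        rate = (clicks + 1) / (shows + 2)  # smoothed so unseen features give 0.0
        return math.log(rate / (1 - rate))


class SecondDevice:
    """Holds statistics for the second feature and answers requests for them."""

    def __init__(self) -> None:
        self.shard = StatsShard()

    def handle_request(self, feature: str) -> float:
        return self.shard.log_odds(feature)


class FirstDevice:
    """Holds statistics for the first feature and makes the prediction."""

    def __init__(self, peer: SecondDevice) -> None:
        self.shard = StatsShard()
        self.peer = peer

    def predict(self, first_feature: str, second_feature: str) -> float:
        local = self.shard.log_odds(first_feature)          # from the updated local values
        remote = self.peer.handle_request(second_feature)   # stand-in for a network request
        return 1 / (1 + math.exp(-(local + remote)))


second = SecondDevice()
first = FirstDevice(second)
for _ in range(5):
    first.shard.update("query:shoes", clicked=True)
    second.shard.update("country:US", clicked=True)
print(first.predict("query:shoes", "country:US"))
```

Keeping each feature's counters on a single device means an update touches only local memory, while a prediction costs one request per remotely held feature.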
15. A non-transitory computer-readable medium comprising:
one or more instructions which, when executed by at least one processor, cause the at least one processor to identify a feature set that includes at least a first feature and a second feature;
one or more instructions which, when executed by the at least one processor, cause the at least one processor to store, in a plurality of memory locations in a memory, values relating to the first feature;
one or more instructions which, when executed by the at least one processor, cause the at least one processor to subject a string, associated with the first feature, to multiple, different hash functions to generate multiple, different hash values;
one or more instructions which, when executed by the at least one processor, cause the at least one processor to identify, for each of the multiple, different hash values, a respective memory location, of the plurality of memory locations in the memory;
one or more instructions which, when executed by the at least one processor, cause the at least one processor to read the values stored at the respective memory locations;
one or more instructions which, when executed by the at least one processor, cause the at least one processor to perform an operation on the read values from the respective memory locations to obtain updated values;
one or more instructions which, when executed by the at least one processor, cause the at least one processor to write the updated values into the respective memory locations;
one or more instructions which, when executed by the at least one processor, cause the at least one processor to send a request to a computer device for information regarding the second feature;
one or more instructions which, when executed by the at least one processor, cause the at least one processor to receive the information from the computer device, and
one or more instructions which, when executed by the at least one processor, cause the at least one processor to use the updated values and the received information to make a prediction regarding particular data.
Dependent claims: 16, 17, 18, 19, 20.
Specification