GENERATING ANONYMOUS DATA FROM WEB DATA
First Claim
1. A method, comprising:
receiving, by a device, web data associated with user devices, the web data being generated based on interactions of the user devices with a network and one or more content provider devices;
removing, by the device, erroneous or objectionable web data from the web data to generate a subset of the web data;
categorizing, by the device, the subset of the web data by assigning categories to the subset of the web data;
performing, by the device, an empirical estimation of the categorized subset of the web data to generate empirical estimations;
performing, by the device, a simulation of the empirical estimations to generate synthetic data that corresponds to the web data and removes private information relating to the user devices and users of the user devices; and
storing, by the device, the synthetic data in a storage device.
Abstract
A device receives web data, associated with user devices, that is generated based on interactions of the user devices with a network and one or more content provider devices. The device removes erroneous or objectionable web data from the web data to generate a subset of the web data, and categorizes the subset of the web data by assigning categories to the subset of the web data. The device performs an empirical estimation of the categorized subset of the web data to generate empirical estimations. The device performs a simulation of the empirical estimations to generate synthetic data that corresponds to the web data and removes private information relating to the user devices and users of the user devices, and stores the synthetic data in a storage device.
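The steps in the abstract can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the patented implementation: the record fields (`user_id`, `domain`, `dwell_s`), the lookup tables, and every function name are hypothetical, "erroneous" is taken to mean a negative dwell time, and the "empirical estimation" is taken to mean simple observed frequencies. The storing step is left out.

```python
import random
from collections import Counter, defaultdict

# Hypothetical lookup tables; the source does not say how categories are
# assigned or what counts as objectionable.
CATEGORY_BY_DOMAIN = {"news.example": "news", "shop.example": "shopping"}
BLOCKED_DOMAINS = {"bad.example"}


def clean(records):
    """Drop erroneous (negative dwell time) or objectionable (blocked) records."""
    return [r for r in records
            if r["dwell_s"] >= 0 and r["domain"] not in BLOCKED_DOMAINS]


def categorize(records):
    """Attach a category to each record; unknown domains become 'other'."""
    return [{**r, "category": CATEGORY_BY_DOMAIN.get(r["domain"], "other")}
            for r in records]


def estimate(records):
    """Empirical estimation: category counts and observed dwell times per category."""
    counts = Counter(r["category"] for r in records)
    dwell = defaultdict(list)
    for r in records:
        dwell[r["category"]].append(r["dwell_s"])
    return counts, dict(dwell)


def simulate(counts, dwell, n, seed=0):
    """Draw n synthetic records from the empirical distributions.

    Only category and dwell time are emitted, so user identifiers never
    reach the output.
    """
    rng = random.Random(seed)
    categories = list(counts)
    weights = [counts[c] for c in categories]
    synthetic = []
    for _ in range(n):
        category = rng.choices(categories, weights=weights)[0]
        synthetic.append({"category": category,
                          "dwell_s": rng.choice(dwell[category])})
    return synthetic


def generate_synthetic(records, n, seed=0):
    """Run clean -> categorize -> estimate -> simulate on raw records."""
    counts, dwell = estimate(categorize(clean(records)))
    return simulate(counts, dwell, n, seed)
```

Resampling observed values is a simplification chosen for brevity. It keeps identifiers out of the output but does not by itself provide a formal privacy guarantee.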
20 Claims
1. A method, comprising:
receiving, by a device, web data associated with user devices, the web data being generated based on interactions of the user devices with a network and one or more content provider devices;
removing, by the device, erroneous or objectionable web data from the web data to generate a subset of the web data;
categorizing, by the device, the subset of the web data by assigning categories to the subset of the web data;
performing, by the device, an empirical estimation of the categorized subset of the web data to generate empirical estimations;
performing, by the device, a simulation of the empirical estimations to generate synthetic data that corresponds to the web data and removes private information relating to the user devices and users of the user devices; and
storing, by the device, the synthetic data in a storage device.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A device, comprising:
one or more processors to:
receive web data associated with user devices, the web data being generated based on interactions of the user devices with a network and a plurality of content provider devices;
remove erroneous or objectionable web data from the web data to generate a subset of the web data;
categorize the subset of the web data by assigning categories to the subset of the web data;
perform an empirical estimation of the categorized subset of the web data to generate empirical estimations;
perform a simulation of the empirical estimations to generate synthetic data that corresponds to the web data and removes private information relating to the user devices and users of the user devices;
store the synthetic data in a storage device; and
output the synthetic data.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer-readable medium for storing instructions, the instructions comprising:
one or more instructions that, when executed by one or more processors of a device, cause the one or more processors to:
receive web data associated with user devices, the web data being generated based on interactions of the user devices with a network and one or more content provider devices, the web data including private information regarding the user devices and/or users of the user devices;
remove erroneous or objectionable web data from the web data to generate a subset of the web data;
categorize the subset of the web data by assigning categories to the subset of the web data;
perform an empirical estimation of the categorized subset of the web data to generate empirical estimations;
perform a simulation of the empirical estimations to generate synthetic data that statistically corresponds to the web data and removes the private information from the web data; and
store the synthetic data in a storage device.
Dependent claims: 16, 17, 18, 19, 20
Specification