Method and system for generating analogous fictional data from non-fictional data
Abstract
A method and system for generating analogous fictional data from non-fictional data are provided. One implementation involves recording non-fictional data; scoring the non-fictional data in terms of occurrence percentile; obtaining a set of user configurations that represents a likeness range between the non-fictional data and the corresponding fictional data; generating analogous fictional data from the non-fictional data based on the scores and the user configurations; and comparing hash values for the fictional data with hash values for the non-fictional data to determine matches. In case of a match, analogous fictional data is generated again from the non-fictional data based on the scores and an incrementally lowered likeness range. Entire records of fictional data are thereby generated from entire records of non-fictional data, and the fictional data is consistent with the non-fictional data.
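For illustration only, the flow summarized in the abstract (percentile scoring, generation within a likeness range, hash comparison, and retry with a lowered likeness) might be sketched as follows. This is not the patented implementation. The function names, the percentile definition, the SHA-256 hash, the field separator, and the 0.1 step size are all assumptions introduced for this sketch; the source does not specify them.

```python
import hashlib
import random
from collections import Counter


def occurrence_percentiles(values):
    """Score each distinct value by occurrence percentile.

    Assumption: percentile = share of distinct values whose occurrence
    count is less than or equal to this value's count.
    """
    counts = Counter(values)
    n = len(counts)
    return {
        v: sum(1 for c in counts.values() if c <= cnt) / n
        for v, cnt in counts.items()
    }


def record_hash(record):
    """Hash an entire record (a tuple of strings). SHA-256 is an assumption."""
    return hashlib.sha256("\x1f".join(record).encode("utf-8")).hexdigest()


def pick(value, scores, likeness, rng):
    """Pick a different value whose percentile lies within (1 - likeness)."""
    width = 1.0 - likeness
    pool = [v for v, s in scores.items()
            if v != value and abs(s - scores[value]) <= width]
    # If no other value falls inside the range, keep the original value.
    return rng.choice(pool) if pool else value


def fictionalize(records, likeness=0.9, step=0.1, seed=0):
    """Generate one fictional record per real record (hypothetical helper)."""
    if step <= 0:
        raise ValueError("step must be positive")
    rng = random.Random(seed)
    # One percentile table per field (column) of the records.
    scores = [occurrence_percentiles(col) for col in zip(*records)]
    real_hashes = {record_hash(r) for r in records}
    fictional = []
    for record in records:
        level = likeness
        while True:
            candidate = tuple(
                pick(v, scores[i], level, rng) for i, v in enumerate(record)
            )
            # Accept when the candidate matches no real record, or when
            # the likeness cannot be lowered any further.
            if record_hash(candidate) not in real_hashes or level <= 0.0:
                break
            level = max(0.0, level - step)  # incrementally lower the likeness
        fictional.append(candidate)
    return fictional
```

In this sketch a lower likeness widens the pool of candidate replacements, which is one plausible reading of "incrementally lowered likeness range". A candidate that still matches a real record once the likeness reaches zero is kept as-is, a simplification a real system would need to handle.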
26 Citations
21 Claims
1. A computer-implemented method for generating a second data set from a first data set comprising:
determining an occurrence value for each element of said first data set pertaining to the quantity of occurrences of that element within said first data set;
receiving a similarity value configured by a user indicating a desired similarity between said first and second data sets; and
generating said second data set to include elements with said occurrence values for said second data set based on said occurrence values of elements of said first data set and said similarity value, wherein said generating said second data set further includes:
examining said occurrence value of an element in said first data set;
retrieving another element of said first data set with an occurrence value within a range from said occurrence value of said examined element in said first data set, wherein said range is indicated by said similarity value to provide said desired similarity; and
substituting said examined element with said retrieved element by placing said retrieved element at a location within said second data set that corresponds to a location of said examined element in said first data set.
(Dependent claims: 2, 3, 4, 5, 6, 7)
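As a reading aid, and not as part of the claims, the steps recited in claim 1 can be sketched in Python. The function name `generate_second_data_set`, the 0-to-1 scale for the similarity value, and the rule that maps similarity to an occurrence range are assumptions made for this sketch; the claim only states that the range is indicated by the similarity value.

```python
import random
from collections import Counter


def generate_second_data_set(first, similarity, rng=None):
    """Build a second data set by substituting elements of `first`.

    Assumption: `similarity` lies in [0, 1]; 1.0 allows only substitutes
    with an identical occurrence value, lower values widen the range.
    """
    if not first:
        return []
    rng = rng or random.Random()

    # Step 1: occurrence value = number of times each element appears.
    occurrence = Counter(first)

    # Step 2: turn the user's similarity value into an occurrence range
    # (hypothetical rule: a fraction of the largest occurrence value).
    tolerance = (1.0 - similarity) * max(occurrence.values())

    second = []
    for element in first:
        # Step 3a: examine the occurrence value of the current element.
        target = occurrence[element]
        # Step 3b: retrieve other elements whose occurrence value is
        # within the range around the examined element's value.
        candidates = [
            other for other, count in occurrence.items()
            if other != element and abs(count - target) <= tolerance
        ]
        # Step 3c: place the substitute at the corresponding location.
        # If nothing qualifies, this sketch keeps the original element.
        second.append(rng.choice(candidates) if candidates else element)
    return second
```

For example, with `first = ["smith", "smith", "smith", "jones", "jones", "lee", "kim"]` and `similarity = 1.0`, only "lee" and "kim" share an occurrence value, so they are swapped for each other and the result is `["smith", "smith", "smith", "jones", "jones", "kim", "lee"]`. The sketch chooses a substitute independently at each position; mapping each distinct element to a single substitute would be an equally plausible reading.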
8. A computer system for generating a second data set from a first data set comprising:
at least one storage system for storing a computer program; and
at least one processor for processing said computer program to:
determine an occurrence value for each element of said first data set pertaining to the quantity of occurrences of that element within said first data set;
receive a similarity value configured by a user indicating a desired similarity between said first and second data sets; and
generate said second data set to include elements with said occurrence values for said second data set based on said occurrence values of elements of said first data set and said similarity value, wherein said generating said second data set further includes:
examining said occurrence value of an element in said first data set;
retrieving another element of said first data set with an occurrence value within a range from said occurrence value of said examined element in said first data set, wherein said range is indicated by said similarity value to provide said desired similarity; and
substituting said examined element with said retrieved element by placing said retrieved element at a location within said second data set that corresponds to a location of said examined element in said first data set.
(Dependent claims: 9, 10, 11, 12, 13, 14)
15. A computer program product for generating a second data set from a first data set, the computer program product comprising:
a computer useable storage medium having computer readable program code embodied therewith, the computer readable program code comprising computer readable program code to:
determine an occurrence value for each element of said first data set pertaining to the quantity of occurrences of that element within said first data set;
receive a similarity value configured by a user indicating a desired similarity between said first and second data sets; and
generate said second data set to include elements with said occurrence values for said second data set based on said occurrence values of elements of said first data set and said similarity value, wherein said generating said second data set further includes:
examining said occurrence value of an element in said first data set;
retrieving another element of said first data set with an occurrence value within a range from said occurrence value of said examined element in said first data set, wherein said range is indicated by said similarity value to provide said desired similarity; and
substituting said examined element with said retrieved element by placing said retrieved element at a location within said second data set that corresponds to a location of said examined element in said first data set.
(Dependent claims: 16, 17, 18, 19, 20, 21)
Specification