IDENTITY INFORMATION DE-IDENTIFICATION DEVICE
First Claim
1. A personal information anonymization device, comprising:
a personal information storing unit configured to store one or more personal information formed of an attribute value for every attribute;
a generalization hierarchy tree automatic generation unit configured to select one attribute and automatically configure a generalization hierarchy tree that represents a dominant concept of each attribute value which occurs in the input personal information for each attribute as a tree structure in accordance with a level of obfuscation using a frequency obtaining unit that counts the number of input personal information having the attribute value for every attribute value that occurs in the selected attribute; and
a unit configured to recode the input personal information using the generalization hierarchy tree generated for each attribute using the generalization hierarchy tree automatic generation unit.
Abstract
De-identification device for automatically configuring a generalization hierarchy tree of attribute values of identity information. The provided de-identification device quantitatively evaluates the amount of information that is lost when an attribute value is generalized, and can thereby automatically assess priorities between de-identified data sets and between data sets that are being de-identified. The information of each person includes that person's attribute values for a plurality of attributes. De-identification is achieved by obfuscating the attribute values, and a structure in which the attribute values to be obfuscated are expressed as a tree according to the level of obfuscation is called a generalization hierarchy tree. The disclosed identity information de-identification device achieves automatic configuration by building the tree from frequency information of the attribute values. By defining a lost information amount metric means that uses the generalization hierarchy tree, the amount of information lost between two de-identified data sets, or between data sets being de-identified, is quantitatively assessed.
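The frequency-driven tree construction described in the abstract can be sketched as follows. The abstract does not say how the frequencies determine the shape of the tree, so this sketch assumes a Huffman-style bottom-up merge in which the two least frequent nodes are repeatedly joined under a new parent. The function name `build_tree`, the `(a|b)` node labels, and the sample values are hypothetical and are not taken from the patent.

```python
import heapq
from collections import Counter
from itertools import count


def build_tree(values):
    """Build a binary generalization tree from raw attribute values.

    Assumption: Huffman-style merging of the two least frequent nodes.
    Returns (parent, freq): a child -> parent map and a node -> frequency map.
    """
    freq = dict(Counter(values))      # leaf frequency = number of records
    parent = {}
    tie = count()                     # tie-breaker for equal frequencies
    heap = [(f, next(tie), v) for v, f in sorted(freq.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        node = f"({a}|{b})"           # hypothetical label for the merged node
        parent[a] = parent[b] = node
        freq[node] = f1 + f2          # internal frequency = sum of children
        heapq.heappush(heap, (f1 + f2, next(tie), node))
    return parent, freq


# Hypothetical values of one attribute across ten records.
diagnoses = ["flu"] * 5 + ["cold"] * 3 + ["asthma"] + ["gout"]
parent, freq = build_tree(diagnoses)
```

Under this assumption, rare values are merged first and therefore sit deeper in the tree, so generalizing them by one level hides them among few records, while frequent values stay close to the root.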
19 Claims
1. A personal information anonymization device, comprising:
a personal information storing unit configured to store one or more personal information formed of an attribute value for every attribute;
a generalization hierarchy tree automatic generation unit configured to select one attribute and automatically configure a generalization hierarchy tree that represents a dominant concept of each attribute value which occurs in the input personal information for each attribute as a tree structure in accordance with a level of obfuscation using a frequency obtaining unit that counts the number of input personal information having the attribute value for every attribute value that occurs in the selected attribute; and
a unit configured to recode the input personal information using the generalization hierarchy tree generated for each attribute using the generalization hierarchy tree automatic generation unit.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 12, 13, 14, 15, 16, 17, 18, 19
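The recoding step of claim 1 can be illustrated with a minimal sketch. It assumes each attribute's generalization hierarchy tree is already available as a child-to-parent map and that recoding means climbing a fixed number of levels per attribute. The names `generalize` and `recode`, the `steps` parameter, and the sample tree and records are hypothetical.

```python
def generalize(value, parent, steps):
    """Climb `steps` levels from `value` toward the root of one tree."""
    for _ in range(steps):
        if value not in parent:       # already at the root
            break
        value = parent[value]
    return value


def recode(records, trees, steps):
    """Recode every record, attribute by attribute.

    trees: attribute -> child-to-parent map (one tree per attribute).
    steps: attribute -> number of levels to climb (default 0).
    Attributes without a tree are left unchanged.
    """
    return [
        {
            attr: generalize(val, trees[attr], steps.get(attr, 0))
            if attr in trees else val
            for attr, val in rec.items()
        }
        for rec in records
    ]


# Hypothetical tree for an "age" attribute.
age_tree = {"23": "20-29", "27": "20-29", "34": "30-39",
            "20-29": "*", "30-39": "*"}
records = [{"age": "23", "city": "Osaka"}, {"age": "34", "city": "Kyoto"}]
recoded = recode(records, {"age": age_tree}, {"age": 1})
```

Here `recoded` holds the ages `"20-29"` and `"30-39"`, while `city` is untouched because no tree was supplied for it.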
10. A personal information anonymization device, comprising:
using one or more personal information formed of attribute values for every attribute and a generalization hierarchy tree that represents a dominant concept of an attribute value which occurs in the one or more personal information for each attribute as a tree structure in accordance with a level of an obfuscation as an input, a lost information amount metric unit configured to calculate an amount of information lost at the time of obfuscating one attribute value of one personal information using the automatically generated generalization hierarchy tree; and
a unit configured to recode the input personal information by obfuscating each attribute value of the input personal information to a node which is a grandparent of a node indicated by the attribute value using the lost information amount metric unit and the generalization hierarchy tree.
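A lost information amount metric of the kind recited in claim 10 might look like the sketch below. The claim does not give a formula, so this assumes a self-information style measure: generalizing a leaf to a higher node loses log2(frequency of the node / frequency of the leaf) bits. The function names, the frequency table, and the reading of "grandparent" as any higher node on the path to the root are assumptions made for illustration.

```python
import math


def information_loss(leaf, ancestor, freq):
    """Bits lost when `leaf` is replaced by `ancestor` (assumed metric).

    freq maps every tree node to its occurrence frequency; generalizing a
    value to itself loses nothing because the ratio is 1.
    """
    return math.log2(freq[ancestor] / freq[leaf])


def total_loss(pairs, freq):
    """Sum the assumed loss over (original value, recoded value) pairs."""
    return sum(information_loss(leaf, anc, freq) for leaf, anc in pairs)


# Hypothetical frequencies: "respiratory" covers "flu" and "cold".
freq = {"flu": 2, "cold": 6, "respiratory": 8}
rare = information_loss("flu", "respiratory", freq)     # log2(8 / 2)
common = information_loss("cold", "respiratory", freq)  # log2(8 / 6)
```

With this measure, hiding the rare value costs more than hiding the common one, which gives a recoding unit a numeric basis for preferring one candidate generalization over another.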
11. A personal information anonymization device, using a generalization hierarchy tree that stores a generalization hierarchy tree that represents a dominant concept of an attribute value for every attribute as a tree structure in accordance with a level of obfuscation, anonymous information in which one or more personal information are anonymized using the generalization hierarchy tree, and a number of personal information in which an attribute value occurs for every attribute value of each attribute as inputs, and
by using a node frequency obtaining unit that in the case of a leaf, counts the occurrence frequencies of nodes of the generalization hierarchy tree as a number of original personal information in which an attribute value indicated by the leaf occurs and in the case of an internal node, counts the occurrence frequencies of nodes of the generalization hierarchy tree as a total frequency of nodes which are grandchildren of an external node and leaves, outputs a value obtained by replacing each of the attribute values of each attribute of the anonymous information of the inputs with an attribute value of a leaf c with a probability of (frequency of c)/(frequency of a) for one or more leaves which are grandchildren of the attribute value when the attribute value is a node a of the generalization hierarchy tree.
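The replacement rule of claim 11 can be sketched as weighted sampling. The sketch assumes the tree is given as a parent-to-children map, reads "grandchildren" as all leaves below a node, and takes an internal node's frequency to be the sum of the frequencies of those leaves. All function names and the sample tree are hypothetical.

```python
import random


def node_frequencies(children, leaf_counts):
    """Leaf frequency = record count; internal frequency = sum below it."""
    freq = dict(leaf_counts)

    def total(node):
        if node not in children:              # leaf; unseen leaves count 0
            return freq.setdefault(node, 0)
        if node not in freq:
            freq[node] = sum(total(c) for c in children[node])
        return freq[node]

    for node in children:
        total(node)
    return freq


def leaves_under(node, children):
    """Return all leaves at or below `node`."""
    if node not in children:
        return [node]
    return [leaf for c in children[node] for leaf in leaves_under(c, children)]


def reconstruct(value, children, freq, rng=random):
    """Replace node a by leaf c with probability freq[c] / freq[a]."""
    leaves = leaves_under(value, children)
    weights = [freq[leaf] for leaf in leaves]  # weights sum to freq[value]
    return rng.choices(leaves, weights=weights, k=1)[0]


# Hypothetical tree and original leaf counts.
children = {"*": ["respiratory", "gout"], "respiratory": ["flu", "cold"]}
freq = node_frequencies(children, {"flu": 6, "cold": 2, "gout": 2})
sample = reconstruct("respiratory", children, freq, rng=random.Random(0))
```

A value that is already a leaf is returned unchanged, since it is the only leaf at or below itself. Passing a seeded `random.Random` instance makes the sampling repeatable.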
Specification