Method and apparatus for categorizing documents containing sensitive information
First Claim
1. A method comprising:
determining one or more probabilities that a document belongs to one or more of a plurality of predefined categories;
determining whether at least one of the one or more probabilities satisfies a first predetermined threshold;
causing the document to be used in data loss detection when at least one of the one or more probabilities satisfies the first predetermined threshold;
if at least one of the one or more probabilities does not satisfy the first predetermined threshold, determining, by a processor, whether at least one of the one or more probabilities satisfies a second predetermined threshold specific to a source of the document; and
causing the document to be used in data loss detection when at least one of the one or more probabilities satisfies the second predetermined threshold.
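The two-stage threshold test recited in the claim can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the function name `should_protect`, the mapping `source_thresholds`, and the reading of "satisfies" as "greater than or equal to" are all hypothetical choices made for the example.

```python
from typing import Mapping


def should_protect(
    probabilities: Mapping[str, float],
    first_threshold: float,
    source_thresholds: Mapping[str, float],
    source: str,
) -> bool:
    """Return True if the document should be used in data loss detection.

    Assumption: a probability "satisfies" a threshold when it is >= that threshold.
    """
    # Stage 1: does any category probability satisfy the first threshold?
    if any(p >= first_threshold for p in probabilities.values()):
        return True

    # Stage 2: otherwise, check the threshold specific to the document's source.
    second_threshold = source_thresholds.get(source)
    if second_threshold is None:
        # Assumption: a source without its own threshold gets no second check.
        return False
    return any(p >= second_threshold for p in probabilities.values())


# Hypothetical category probabilities and source names, for illustration only.
probs = {"financial": 0.62, "legal": 0.10}
per_source = {"finance_share": 0.5}

print(should_protect(probs, 0.8, per_source, "finance_share"))  # second stage applies
print(should_protect(probs, 0.8, per_source, "public_wiki"))    # no source threshold
```

In this sketch the source-specific threshold is consulted only when the first threshold is not met, mirroring the order of the steps in the claim.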
Abstract
A method and apparatus for determining whether a document is to be protected is described. In one embodiment, a computer system identifies a document to be categorized. The computer system then determines one or more probabilities that the document belongs to one or more of a plurality of predefined categories, the probabilities based on profiles of the predefined categories. The computer system then determines whether the probabilities indicate that the document is to be protected, and, if the document is to be protected, causes the document to be used in data loss detection.
20 Claims
1. A method comprising:
determining one or more probabilities that a document belongs to one or more of a plurality of predefined categories;
determining whether at least one of the one or more probabilities satisfies a first predetermined threshold;
causing the document to be used in data loss detection when at least one of the one or more probabilities satisfies the first predetermined threshold;
if at least one of the one or more probabilities does not satisfy the first predetermined threshold, determining, by a processor, whether at least one of the one or more probabilities satisfies a second predetermined threshold specific to a source of the document; and
causing the document to be used in data loss detection when at least one of the one or more probabilities satisfies the second predetermined threshold.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A non-transitory computer readable storage medium that provides instructions, which when executed on a processor cause the processor to perform a method comprising:
determining one or more probabilities that a document belongs to one or more of a plurality of predefined categories;
determining whether at least one of the one or more probabilities satisfies a first predetermined threshold;
causing the document to be used in data loss detection when at least one of the one or more probabilities satisfies the first predetermined threshold;
if at least one of the one or more probabilities does not satisfy the first predetermined threshold, determining, by the processor, whether at least one of the one or more probabilities satisfies a second predetermined threshold specific to a source of the document; and
causing the document to be used in data loss detection when at least one of the one or more probabilities satisfies the second predetermined threshold.
Dependent claims: 10, 11, 12, 13, 14
15. A system, comprising:
a memory; and
a processor coupled with the memory to
determine one or more probabilities that a document belongs to one or more of a plurality of predefined categories,
determine whether at least one of the one or more probabilities satisfies a first predetermined threshold,
cause the document to be used in data loss detection when at least one of the one or more probabilities satisfies the first predetermined threshold,
if at least one of the one or more probabilities does not satisfy the first predetermined threshold, determine whether at least one of the one or more probabilities satisfies a second predetermined threshold specific to a source of the document, and
cause the document to be used in data loss detection when at least one of the one or more probabilities satisfies the second predetermined threshold.
Dependent claims: 16, 17, 18, 19, 20
Specification