APPARATUS AND METHODS OF IDENTIFYING POTENTIALLY SIMILAR CONTENT FOR DATA REDUCTION
Abstract
Apparatus and methods of identifying potentially similar content include utilizing workflow metadata to identify potential similarities in content to be processed, or between content to be processed and known content. As a result, a subset of potentially similar content is identified, and the subset can be used in data reduction operations to reduce data in the content to be processed.
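The approach in the abstract can be illustrated with a minimal sketch that shortlists known content by comparing workflow metadata rather than the data itself. The metadata keys (`tool`, `template`, `step`), the Jaccard overlap measure, and the 0.5 threshold are assumptions chosen for illustration only; the text above does not specify how similarity between workflow metadata is measured.

```python
from typing import Dict, List, Tuple

# Workflow metadata modeled as flat key/value pairs (an assumption).
Metadata = Dict[str, str]


def metadata_similarity(a: Metadata, b: Metadata) -> float:
    """Jaccard overlap of (key, value) pairs; a stand-in similarity measure."""
    pairs_a, pairs_b = set(a.items()), set(b.items())
    if not pairs_a and not pairs_b:
        return 0.0
    return len(pairs_a & pairs_b) / len(pairs_a | pairs_b)


def identify_potentially_similar(
    content_meta: Metadata,
    known_meta: Dict[str, Metadata],
    threshold: float = 0.5,  # hypothetical cut-off
) -> List[Tuple[str, float]]:
    """Return (known content id, score) pairs at or above the threshold,
    best match first; ties are ordered by id."""
    scored = [
        (content_id, metadata_similarity(content_meta, meta))
        for content_id, meta in known_meta.items()
    ]
    hits = [(cid, score) for cid, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Hypothetical example data.
incoming = {"tool": "scanner-A", "template": "invoice-v2", "step": "ocr"}
known = {
    "doc-001": {"tool": "scanner-A", "template": "invoice-v2", "step": "ocr"},
    "doc-002": {"tool": "scanner-A", "template": "invoice-v2", "step": "archive"},
    "doc-003": {"tool": "camera-B", "template": "receipt-v1", "step": "ocr"},
}

subset = identify_potentially_similar(incoming, known)
print(subset)
```

Only the shortlisted subset would then be handed to a data reduction step, such as delta encoding or deduplication against the matching known content, so the costlier byte-level comparison is not run against every known item.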
41 Claims
1. A computer-implemented method of identifying potentially similar content for data reduction, comprising:
receiving content workflow metadata corresponding to content to be processed, wherein the content to be processed includes a data component, and wherein the content workflow metadata represents workflow processing information corresponding to the data component;
receiving known content workflow metadata corresponding to a first plurality of known content, wherein each known content includes a known data component, and wherein the known content workflow metadata represents workflow processing information corresponding to each respective known data component;
determining a potential similarity between the data component of the content to be processed and at least one known data component of at least one of the first plurality of known content based on a similarity between the respective content workflow metadata and the respective known content workflow metadata; and
outputting an identification of potentially similar content, based on the determined potential similarity, for use in reducing data in the content to be processed.
Dependent claims: 2-19.
20. A computer program product configured to identify potentially similar content for data reduction, comprising:
a computer-readable medium comprising:
at least one set of instructions operable to cause a computer to receive content workflow metadata corresponding to content to be processed, wherein the content to be processed includes a data component, and wherein the content workflow metadata represents workflow processing information corresponding to the data component;
at least one set of instructions operable to cause the computer to receive known content workflow metadata corresponding to a first plurality of known contents, wherein each known content includes a known data component, and wherein the known content workflow metadata represents workflow processing information corresponding to each respective known data component;
at least one set of instructions operable to cause the computer to determine a potential similarity between the data component of the content to be processed and at least one known data component of at least one of the first plurality of known contents based on a potential similarity between the respective content workflow metadata and the respective known content workflow metadata; and
at least one set of instructions operable to cause the computer to output an identification of potentially similar content, based on the determined potential similarity, for use in reducing data in the content to be processed.
21. At least one processor configured to identify potentially similar content for data reduction, comprising:
a first module for receiving content workflow metadata corresponding to content to be processed, wherein the content to be processed includes a data component, and wherein the content workflow metadata represents workflow processing information corresponding to the data component;
a second module for receiving known content workflow metadata corresponding to a first plurality of known contents, wherein each known content includes a known data component, and wherein the known content workflow metadata represents workflow processing information corresponding to each respective known data component;
a third module for determining a potential similarity between the data component of the content to be processed and at least one known data component of at least one of the first plurality of known contents based on a potential similarity between the respective content workflow metadata and the respective known content workflow metadata; and
a fourth module for outputting an identification of potentially similar content, based on the determined potential similarity, for use in reducing data in the content to be processed.
22. A computing device for identifying potentially similar content for data reduction, comprising:
means for receiving content workflow metadata corresponding to content to be processed, wherein the content to be processed includes a data component, and wherein the content workflow metadata represents workflow processing information corresponding to the data component;
means for receiving known content workflow metadata corresponding to a first plurality of known contents, wherein each known content includes a known data component, and wherein the known content workflow metadata represents workflow processing information corresponding to each respective known data component;
means for determining a potential similarity between the data component of the content to be processed and at least one known data component of at least one of the first plurality of known contents based on a potential similarity between the respective content workflow metadata and the respective known content workflow metadata; and
means for outputting an identification of potentially similar content, based on the determined potential similarity, for use in reducing data in the content to be processed.
23. A computing device for identifying potentially similar content for data reduction, comprising:
a communications module operable to receive content workflow metadata corresponding to content to be processed, wherein the content to be processed includes a data component, and wherein the content workflow metadata represents workflow processing information corresponding to the data component;
wherein the communications module is further operable to receive known content workflow metadata corresponding to a first plurality of known content, wherein each known content includes a known data component, and wherein the known content workflow metadata represents workflow processing information corresponding to each respective known data component;
a similarity identifier module having one or more similarity rules operable to determine a potential similarity between the data component of the content to be processed and at least one known data component of at least one of the first plurality of known content based on a potential similarity between the respective content workflow metadata and the respective known content workflow metadata; and
wherein the similarity identifier component is further operable to output an identification of potentially similar content, based on the determined potential similarity, for use in reducing data in the content to be processed.
Dependent claims: 24-41.
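The "similarity identifier module having one or more similarity rules" recited in claim 23 could be sketched, very loosely, as a small class that applies pluggable rule functions to pairs of workflow metadata. Everything named here (`SimilarityIdentifier`, `same_template`, `same_source_and_step`, and the metadata keys) is hypothetical, and treating a single matching rule as sufficient is an assumption; the claim does not say what the rules are or how they combine.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Metadata = Dict[str, str]
# A rule inspects two metadata records and reports whether they look related.
Rule = Callable[[Metadata, Metadata], bool]


def same_template(a: Metadata, b: Metadata) -> bool:
    """Hypothetical rule: both items were produced from the same template."""
    return "template" in a and a.get("template") == b.get("template")


def same_source_and_step(a: Metadata, b: Metadata) -> bool:
    """Hypothetical rule: same originating device and same workflow step."""
    return all(k in a and a.get(k) == b.get(k) for k in ("source", "step"))


@dataclass
class SimilarityIdentifier:
    """Applies the configured rules; any one matching rule flags a candidate."""
    rules: List[Rule] = field(default_factory=list)

    def identify(
        self, content_meta: Metadata, known_meta: Dict[str, Metadata]
    ) -> List[str]:
        return [
            content_id
            for content_id, meta in known_meta.items()
            if any(rule(content_meta, meta) for rule in self.rules)
        ]


# Hypothetical example data.
identifier = SimilarityIdentifier(rules=[same_template, same_source_and_step])
incoming = {"source": "press-7", "template": "catalog-2024", "step": "layout"}
known = {
    "job-a": {"source": "press-7", "template": "flyer-2023", "step": "layout"},
    "job-b": {"source": "press-2", "template": "catalog-2024", "step": "proof"},
    "job-c": {"source": "press-2", "template": "flyer-2023", "step": "proof"},
}

candidates = identifier.identify(incoming, known)
print(candidates)
```

Keeping the rules as plain functions means a rule can be added or removed without touching the identifier itself, which is one plausible reading of "one or more similarity rules".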
Specification