Electronic content curating mechanisms
First Claim
1. A method, in a data processing system, for managing an electronic document collection, comprising:
analyzing a first electronic document to identify a reference to a second electronic document;
analyzing the second electronic document to identify document dependencies with zero or more other electronic documents;
generating a dependency information data structure based on the analysis of the first electronic document and the analysis of the second electronic document, wherein the dependency information data structure comprises a dependency graph data structure of the electronic document collection, the dependency graph data structure comprising first nodes representing electronic documents in the electronic document collection, second nodes representing authors of electronic documents in the electronic document collection, and edges between nodes representing relationships between nodes, wherein each of the first nodes and the second nodes have an associated node strength attribute, and wherein the associated node strength attribute is a measure of a relative importance of the associated first node or the associated second node to the dependency graph data structure of the electronic document collection and a fragility of the dependency graph data structure with regard to the associated first node or the associated second node;
analyzing the dependency information data structure to identify a loaded document subset of the electronic document collection that is a subset of electronic documents to be loaded into memory when performing an information analysis operation;
generating an electronic document curation action recommendation based on the identified subset of the electronic document collection; and
outputting the electronic document curation action recommendation.
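The dependency graph recited in this claim can be pictured with a minimal Python sketch. Everything here is an illustrative assumption rather than the patent's implementation: the `Node` and `DependencyGraph` classes, the `build_graph` helper, the `author:` prefix, the relation labels, and the input format (a dictionary mapping each document ID to its authors and references) are all hypothetical. The `strength` field is only a placeholder for the node strength attribute; the claim does not fix how it is computed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    kind: str              # "document" (first nodes) or "author" (second nodes)
    strength: float = 0.0  # placeholder for the node strength attribute


@dataclass
class DependencyGraph:
    nodes: dict[str, Node] = field(default_factory=dict)
    # Each edge is (source_id, target_id, relation); relation labels are hypothetical.
    edges: set[tuple[str, str, str]] = field(default_factory=set)

    def add_node(self, node_id: str, kind: str) -> Node:
        # Keep the existing node if the ID was already registered.
        return self.nodes.setdefault(node_id, Node(node_id, kind))

    def add_edge(self, source_id: str, target_id: str, relation: str) -> None:
        self.edges.add((source_id, target_id, relation))


def build_graph(collection: dict[str, dict]) -> DependencyGraph:
    """Build document nodes, author nodes, and the edges between them."""
    graph = DependencyGraph()
    for doc_id, doc in collection.items():
        graph.add_node(doc_id, "document")
        for author in doc.get("authors", []):
            author_id = f"author:{author}"  # prefix avoids clashes with document IDs
            graph.add_node(author_id, "author")
            graph.add_edge(author_id, doc_id, "authored")
        for ref in doc.get("references", []):
            graph.add_node(ref, "document")
            graph.add_edge(doc_id, ref, "references")
    return graph


# Hypothetical three-document collection.
collection = {
    "A": {"authors": ["alice"], "references": ["B"]},
    "B": {"authors": ["bob"], "references": ["C"]},
    "C": {"authors": ["alice"], "references": []},
}
graph = build_graph(collection)
```

Storing authors and documents in one node table with a `kind` tag keeps a single edge set for both authorship and reference relationships; a real system might well separate them.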
Abstract
Mechanisms for managing an electronic document collection are provided. A first electronic document is analyzed to identify a reference to a second electronic document and the second electronic document is analyzed to identify document dependencies with zero or more other electronic documents. A dependency information data structure is generated based on the analysis. The dependency information data structure is analyzed to identify a subset of the electronic document collection that is to be loaded into memory when performing an information analysis operation. An electronic document curation action recommendation is generated based on the identified subset of the electronic document collection. The electronic document curation action recommendation is then output.
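One way to read the subset-identification and recommendation steps is sketched below in Python. This is a sketch under stated assumptions, not the patented method: it assumes the loaded subset is the set of documents reachable by following references from a starting document, and it invents a simple recommendation rule based on a hypothetical `memory_budget`. The function names, the `references` mapping, and the action labels are all illustrative.

```python
from collections import deque


def loaded_subset(references: dict[str, list[str]], start: str) -> set[str]:
    """Documents reachable from `start` by following references (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        doc = queue.popleft()
        for ref in references.get(doc, ()):
            if ref not in seen:
                seen.add(ref)
                queue.append(ref)
    return seen


def recommend(references: dict[str, list[str]], start: str, memory_budget: int) -> dict:
    """Hypothetical rule: flag the subset for review when it exceeds the budget."""
    subset = loaded_subset(references, start)
    action = "review_dependencies" if len(subset) > memory_budget else "none"
    return {"action": action, "documents": sorted(subset)}


# Hypothetical collection: D refers to A, but nothing reachable from A refers to D.
references = {"A": ["B"], "B": ["C"], "C": [], "D": ["A"]}
recommendation = recommend(references, "A", memory_budget=2)
```

Here analyzing document `A` pulls in `B` and `C`, so three documents would be loaded and the hypothetical budget of two is exceeded.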
21 Claims
1. A method, in a data processing system, for managing an electronic document collection, comprising:
analyzing a first electronic document to identify a reference to a second electronic document;
analyzing the second electronic document to identify document dependencies with zero or more other electronic documents;
generating a dependency information data structure based on the analysis of the first electronic document and the analysis of the second electronic document, wherein the dependency information data structure comprises a dependency graph data structure of the electronic document collection, the dependency graph data structure comprising first nodes representing electronic documents in the electronic document collection, second nodes representing authors of electronic documents in the electronic document collection, and edges between nodes representing relationships between nodes, wherein each of the first nodes and the second nodes have an associated node strength attribute, and wherein the associated node strength attribute is a measure of a relative importance of the associated first node or the associated second node to the dependency graph data structure of the electronic document collection and a fragility of the dependency graph data structure with regard to the associated first node or the associated second node;
analyzing the dependency information data structure to identify a loaded document subset of the electronic document collection that is a subset of electronic documents to be loaded into memory when performing an information analysis operation;
generating an electronic document curation action recommendation based on the identified subset of the electronic document collection; and
outputting the electronic document curation action recommendation.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
11. A computer program product comprising a non-transitory computer readable storage medium having a computer readable program stored therein, wherein the computer readable program, when executed on a computing device, causes the computing device to:
analyze a first electronic document to identify a reference to a second electronic document;
analyze the second electronic document to identify document dependencies with zero or more other electronic documents;
generate a dependency information data structure based on the analysis of the first electronic document and the analysis of the second electronic document, wherein the dependency information data structure comprises a dependency graph data structure of the electronic document collection, the dependency graph data structure comprising first nodes representing electronic documents in the electronic document collection, second nodes representing authors of electronic documents in the electronic document collection, and edges between nodes representing relationships between nodes, wherein each of the first nodes and the second nodes have an associated node strength attribute, and wherein the associated node strength attribute is a measure of a relative importance of the associated first node or the associated second node to the dependency graph data structure of the electronic document collection and a fragility of the dependency graph data structure with regard to the associated first node or the associated second node;
analyze the dependency information data structure to identify a loaded document subset of the electronic document collection that is a subset of electronic documents to be loaded into memory when performing an information analysis operation;
generate an electronic document curation action recommendation based on the identified subset of the electronic document collection; and
output the electronic document curation action recommendation.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20.
21. An apparatus, comprising:
a processor; and
a memory coupled to the processor, wherein the memory comprises instructions which, when executed by the processor, cause the processor to:
analyze a first electronic document to identify a reference to a second electronic document;
analyze the second electronic document to identify document dependencies with zero or more other electronic documents;
generate a dependency information data structure based on the analysis of the first electronic document and the analysis of the second electronic document, wherein the dependency information data structure comprises a dependency graph data structure of the electronic document collection, the dependency graph data structure comprising first nodes representing electronic documents in the electronic document collection, second nodes representing authors of electronic documents in the electronic document collection, and edges between nodes representing relationships between nodes, wherein each of the first nodes and the second nodes have an associated node strength attribute, and wherein the associated node strength attribute is a measure of a relative importance of the associated first node or the associated second node to the dependency graph data structure of the electronic document collection and a fragility of the dependency graph data structure with regard to the associated first node or the associated second node;
analyze the dependency information data structure to identify a loaded document subset of the electronic document collection that is a subset of electronic documents to be loaded into memory when performing an information analysis operation;
generate an electronic document curation action recommendation based on the identified subset of the electronic document collection; and
output the electronic document curation action recommendation.
Specification