System and method for processing data in a distributed architecture
Abstract
A system, method, and processor readable medium for processing data in a knowledge management system gather information content and transmit a work request for the information content gathered. The information content may be registered with a content map and assigned a unique document identifier. A work queue processes the work requests. The processed information may then be transmitted to another work queue for further processing. Further processing may include categorization, full-text indexing, metrics extraction, or other processes. Control messages may be transmitted to one or more users providing a status of the work request. The information may be analyzed and further indexed. A progress statistics report may be generated for each of the processes performed on the document. The progress statistics may be provided in a record. Shared access to a central data structure representing the metrics history and taxonomy may be provided for all work queues via a CORBA service.
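The registration and progress-reporting ideas in the abstract can be illustrated with a minimal Python sketch. It is not taken from the specification: the class names `ContentMap` and `ProgressRecord`, the sequential integer identifiers, and the example location string are all assumptions made for illustration.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class ContentMap:
    """Hypothetical content map: one unique document identifier per location."""
    _ids: dict = field(default_factory=dict)
    _counter: itertools.count = field(default_factory=lambda: itertools.count(1))

    def register(self, location: str) -> int:
        # Re-registering a known location returns its existing identifier.
        if location not in self._ids:
            self._ids[location] = next(self._counter)
        return self._ids[location]


@dataclass
class ProgressRecord:
    """Hypothetical per-document record of the processes performed on it."""
    doc_id: int
    steps: list = field(default_factory=list)

    def log(self, process: str, status: str) -> None:
        # Append one (process, status) entry to the record.
        self.steps.append((process, status))


content_map = ContentMap()
doc_id = content_map.register("file://share/reports/q1.txt")  # made-up location
record = ProgressRecord(doc_id)
for process in ("categorization", "full-text indexing", "metrics extraction"):
    record.log(process, "done")
```

Keeping the identifier assignment inside the content map means a document gathered twice keeps the same identifier, which is one plausible reading of "unique document identifier"; the abstract does not say how identifiers are generated.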
40 Claims
1. A method for processing data in a distributed architecture, the method comprising the steps of:
receiving a work request that identifies at least one repository for processing, wherein the at least one identified repository is included in a plurality of repositories;
determining a repository type of the at least one repository;
determining a spider type for gathering information content from the at least one identified repository, wherein the spider type is determined based on the repository type;
gathering information content from the at least one repository identified in the work request;
registering the information content;
assigning the information content to at least one document identifier;
transmitting the work request regarding the information content to a first work queue;
processing the work request by generating a meta-document representation of at least a portion of the information content;
transmitting the meta-document representation to a second work queue; and
analyzing the meta-document representation.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 11
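The sequence of steps recited in claim 1 can be sketched in Python as a two-stage queue pipeline. This is only an illustrative reading of the claim, not the patented implementation: the repository type names, the spider type names, the in-memory `documents` dictionary standing in for a real repository, and the token-count "analysis" are all assumptions.

```python
import queue

# Hypothetical mapping from repository type to spider type.
SPIDER_FOR_REPOSITORY = {"file_system": "file_spider", "web_site": "web_spider"}


def gather(repository):
    """Stand-in spider: returns (location, text) pairs already held in memory."""
    return list(repository["documents"].items())


def run_pipeline(work_request, content_map):
    first_queue, second_queue = queue.Queue(), queue.Queue()
    results = []

    for repository in work_request["repositories"]:
        # The spider type is chosen from the repository type.
        spider_type = SPIDER_FOR_REPOSITORY[repository["type"]]
        for location, text in gather(repository):
            # Register the content and assign it a document identifier.
            doc_id = content_map.setdefault(location, len(content_map) + 1)
            first_queue.put({"doc_id": doc_id, "spider": spider_type, "text": text})

    # First work queue: generate a meta-document representation.
    while not first_queue.empty():
        item = first_queue.get()
        meta_document = {"doc_id": item["doc_id"],
                         "spider": item["spider"],
                         "tokens": item["text"].lower().split()}
        second_queue.put(meta_document)

    # Second work queue: analyze the meta-document (here, a simple token count).
    while not second_queue.empty():
        meta = second_queue.get()
        results.append({"doc_id": meta["doc_id"], "token_count": len(meta["tokens"])})
    return results


request = {"repositories": [{"type": "file_system",
                             "documents": {"a.txt": "Hello distributed world",
                                           "b.txt": "Work queues"}}]}
results = run_pipeline(request, {})
```

The sketch runs both queues in a single process for brevity; in a distributed architecture each queue would presumably be served by separate workers.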
10. A system for processing data in a distributed architecture, the system comprising:
a scheduling module that identifies at least one repository for processing, determines a repository type of the at least one repository, determines a spider type for gathering information content from the at least one identified repository, and generates a work request that identifies the at least one repository for processing, wherein the spider type is determined based on the repository type, and the repository is included in a plurality of repositories;
an information content gathering module that gathers information content from the repository identified in the work request;
a registering module that registers the information content;
an assigning module that assigns the information content at least one document identifier;
a work request transmitting module that transmits the work request regarding the information content to a first work queue;
a work request processing module that processes the work request by generating a meta-document representation of at least a portion of the information content;
an information content transmitting module that transmits the meta-document representation to a second work queue; and
an information content processing module that analyzes the meta-document representation.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
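The scheduling module of the system claim, which determines a repository type, picks a spider type from it, and generates a work request, might look like the following Python sketch. Everything concrete here is assumed for illustration: inferring the repository type from a URL scheme, the type and spider names, and the `WorkRequest` fields do not come from the claim.

```python
from dataclasses import dataclass

# Hypothetical URL schemes, repository types, and spider types.
REPOSITORY_TYPE_FOR_SCHEME = {"file": "file_system", "http": "web_site", "https": "web_site"}
SPIDER_TYPE_FOR_REPOSITORY = {"file_system": "FileSpider", "web_site": "HttpSpider"}


@dataclass(frozen=True)
class WorkRequest:
    repository: str
    repository_type: str
    spider_type: str


def repository_type_of(location: str) -> str:
    """Infer the repository type from the location's scheme (an assumption)."""
    scheme = location.split("://", 1)[0]
    try:
        return REPOSITORY_TYPE_FOR_SCHEME[scheme]
    except KeyError:
        raise ValueError(f"unknown repository type for {location!r}") from None


def schedule(locations):
    """Generate one work request per identified repository."""
    requests = []
    for location in locations:
        repo_type = repository_type_of(location)
        # The spider type depends only on the repository type.
        spider_type = SPIDER_TYPE_FOR_REPOSITORY[repo_type]
        requests.append(WorkRequest(location, repo_type, spider_type))
    return requests


work_requests = schedule(["file://share/docs", "https://intranet.example"])
```

Using two lookup tables keeps the "repository type, then spider type" dependency explicit, so supporting a new kind of repository only requires adding entries rather than changing the scheduling logic.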
21. A system for processing data in a distributed architecture, the system comprising:
scheduling means for identifying at least one repository for processing, determining a repository type of the at least one repository, determining a spider type for gathering information content from the at least one identified repository, and generating a work request that identifies the at least one repository for processing, wherein the spider type is determined based on the repository type, and the repository is included in a plurality of repositories;
gathering means for gathering information content from the repository identified in the work request;
registering means for registering the information content;
assigning means for assigning the information content at least one document identifier;
work request transmitting means for transmitting the work request regarding the information content to a first work queue;
work request processing means for processing the work request by generating a meta-document representation of at least a portion of the information content;
information content transmitting means for transmitting the meta-document representation to a second work queue; and
information content processing means for analyzing the meta-document representation.
Dependent claims: 22, 23, 24, 25, 26, 27, 28, 29, 30
31. A processor readable medium comprising processor readable code embodied therein for causing a processor to process data in a distributed architecture, the medium comprising:
work request receiving code that causes a processor to receive a work request that identifies at least one repository for processing, wherein the at least one identified repository is included in a plurality of repositories;
repository type determining code that causes a processor to determine a repository type of the at least one repository;
spider type determining code that causes a processor to determine a spider type for gathering information content from the at least one identified repository, wherein the spider type is determined based on the repository type;
information content gathering code that causes a processor to gather information content from the repository identified in the work request;
registering code that causes a processor to register the information content;
assigning code that causes a processor to assign the information content at least one document identifier;
work request transmitting code that causes a processor to transmit the work request regarding the information content to a first work queue;
work request processing code that causes a processor to process the work request by generating a meta-document representation of at least a portion of the information content;
information content transmitting code that causes a processor to transmit the meta-document representation to a second work queue; and
information content processing code that causes a processor to analyze the meta-document representation.
Dependent claims: 32, 33, 34, 35, 36, 37, 38, 39, 40
Specification