System and method for cataloguing digital information for searching and retrieval
Abstract
A system and method for searching and retrieving information stored in heterogeneous information repositories. A portal server retrieves user requests through a computer network and looks up information stored in a metadata database. For example, the metadata may be encoded in an XML/RDF format and stored in a directory server to facilitate effective searching and retrieval of information from an information repository. The metadata includes a classmark definition for each document. The classmark is determined through an automated cataloguing process.
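As an illustration of the XML/RDF encoding mentioned in the abstract, the following is a minimal sketch of how a metadata record carrying a classmark might be serialized. The source does not specify a schema, so everything here is an assumption: the `cat` namespace URI, the `classmark` element name, the document URI, the choice of Dublin Core elements for title and keywords, and the `build_metadata_record` helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
CAT_NS = "http://example.org/catalogue#"  # hypothetical vocabulary, not from the source

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)
ET.register_namespace("cat", CAT_NS)


def build_metadata_record(doc_uri, title, keywords, classmark):
    """Serialize one document's metadata as an RDF/XML string."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": doc_uri}
    )
    ET.SubElement(desc, f"{{{DC_NS}}}title").text = title
    for kw in keywords:
        # One dc:subject element per collected keyword.
        ET.SubElement(desc, f"{{{DC_NS}}}subject").text = kw
    # The classmark assigned by the cataloguing process (element name assumed).
    ET.SubElement(desc, f"{{{CAT_NS}}}classmark").text = classmark
    return ET.tostring(root, encoding="unicode")


record = build_metadata_record(
    "http://example.org/repo-a/doc42",  # hypothetical document URI
    "Quarterly report",
    ["finance", "report"],
    "FIN.REP",
)
print(record)
```

A record of this shape could then be stored as an entry in a directory server; that storage step is not modelled here.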
14 Claims
1. A system for automatically cataloguing documents located in multiple heterogeneous repositories, the system comprising:
a scanning tool for scanning the multiple heterogeneous repositories to collect keywords for the documents located therein;
a keyword index to the documents built using the collected keywords;
a mapping tool for mapping the documents using the keyword index to one or more classes, each of the one or more classes including keywords representative of that class; and
a computing device for creating metadata indicative of each of the documents as defined by each of the documents' keywords and one or more classes and cataloguing each of the documents in an integrated library according to the metadata in a meta-index, wherein the meta-index retains the characteristics of each of the multiple heterogeneous repositories as applied to each of the documents such that a user may access one or more of the documents within the multiple heterogeneous repositories utilizing the meta-index; and
further wherein the characteristics of the multiple heterogeneous repositories are transparent to the user when one or more of the documents are accessed using the meta-index.
Dependent claims: 2, 3, 4, 5
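The transparency recited in claim 1 can be pictured with a small sketch: each meta-index entry retains which repository holds the document and how that repository addresses it, while the caller works only with a catalogue identifier. This is an illustrative assumption, not the patented implementation; the `MetaEntry` type, the `ADAPTERS` registry, the in-memory `FILES` and `MAIL` stand-ins, and all identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class MetaEntry:
    repository: str          # which repository holds the document
    locator: str             # repository-specific address, retained in the meta-index
    classes: Tuple[str, ...]  # classes assigned during cataloguing


# Hypothetical in-memory stand-ins for two heterogeneous repositories.
FILES = {"/reports/q1.txt": "Q1 revenue summary"}
MAIL = {"17": "Outage notice"}

# One access routine per repository type; the user never calls these directly.
ADAPTERS: Dict[str, Callable[[str], str]] = {
    "file_share": lambda loc: FILES[loc],
    "mail_archive": lambda loc: MAIL[loc],
}

META_INDEX: Dict[str, MetaEntry] = {
    "doc-1": MetaEntry("file_share", "/reports/q1.txt", ("finance",)),
    "doc-2": MetaEntry("mail_archive", "17", ("operations",)),
}


def fetch(doc_id: str) -> str:
    """Retrieve a document by catalogue id, hiding the repository details."""
    entry = META_INDEX[doc_id]
    return ADAPTERS[entry.repository](entry.locator)


def find_by_class(cls: str):
    """List catalogue ids of all documents mapped to the given class."""
    return sorted(d for d, e in META_INDEX.items() if cls in e.classes)


print(find_by_class("finance"))
print(fetch("doc-2"))
```

The caller supplies only `doc-1` or `doc-2`; the repository name and locator stay inside the meta-index entry, which is one way to read "transparent to the user".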
6. A method for automatically cataloguing documents located in multiple heterogeneous repositories, comprising:
scanning the multiple heterogeneous repositories to collect keywords from the documents located therein;
building a keyword index to the documents stored in the multiple heterogeneous repositories using the collected keywords;
mapping the documents using the keyword index into predetermined classes, wherein the mapping is performed using at least one mapping tool;
creating metadata information, including identification of the predetermined class, for the documents; and
cataloguing each of the documents in an integrated library according to the metadata in a meta-index, wherein the meta-index retains the characteristics of each of the multiple heterogeneous repositories as applied to each of the documents such that a user may access one or more of the documents within the multiple heterogeneous repositories utilizing the meta-index.
Dependent claims: 7, 8, 9
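The steps of claim 6 (scan, build a keyword index, map to predetermined classes, create metadata, catalogue) can be sketched end to end as follows. This is a simplified illustration under stated assumptions: repositories are modelled as in-memory dictionaries of text, keywords are lowercase words minus a small stopword list, and a document is mapped to a class when it shares any keyword with that class. None of these choices, nor the function names, come from the source.

```python
import re
from collections import defaultdict

STOPWORDS = {"the", "a", "an", "of", "and", "for", "to", "in"}  # assumed list


def collect_keywords(text):
    """Extract lowercase words, dropping stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def scan(repositories):
    """repositories: {repo_name: {doc_id: text}} -> {(repo, doc_id): keywords}."""
    return {
        (repo, doc_id): collect_keywords(text)
        for repo, docs in repositories.items()
        for doc_id, text in docs.items()
    }


def build_keyword_index(doc_keywords):
    """Inverted index: keyword -> set of documents containing it."""
    index = defaultdict(set)
    for doc, kws in doc_keywords.items():
        for kw in kws:
            index[kw].add(doc)
    return index


def map_to_classes(index, classes):
    """classes: {class_name: representative keywords} -> {doc: set of classes}."""
    mapping = defaultdict(set)
    for cls, kws in classes.items():
        for kw in kws:
            for doc in index.get(kw, ()):
                mapping[doc].add(cls)
    return mapping


def catalogue(doc_keywords, mapping):
    """Create one metadata record per document, keyed in a meta-index."""
    return {
        (repo, doc_id): {
            "repository": repo,  # retained so the source repository is not lost
            "keywords": sorted(kws),
            "classes": sorted(mapping.get((repo, doc_id), ())),
        }
        for (repo, doc_id), kws in doc_keywords.items()
    }


repositories = {
    "file_share": {"budget.txt": "Annual budget and revenue forecast"},
    "mail_archive": {"msg-17": "Server outage report for the network"},
}
classes = {
    "finance": {"budget", "revenue"},
    "operations": {"server", "network", "outage"},
}

doc_keywords = scan(repositories)
index = build_keyword_index(doc_keywords)
meta_index = catalogue(doc_keywords, map_to_classes(index, classes))
print(meta_index[("file_share", "budget.txt")])
```

Keying the meta-index by `(repository, doc_id)` is one simple way to keep each document tied to its source repository; a real system would likely use richer per-repository attributes.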
10. A system for automatically cataloguing documents located in multiple heterogeneous repositories, the system comprising:
means for scanning the multiple heterogeneous repositories to collect keywords from the documents located therein;
means for building a keyword index to the documents stored in the multiple heterogeneous repositories using the collected keywords;
means for mapping the documents using the keyword index into predetermined classes, wherein the mapping is performed using at least one mapping tool;
means for creating metadata information, including identification of the predetermined class, for the documents; and
means for cataloguing each of the documents in an integrated library according to the metadata in a meta-index, wherein the meta-index retains the characteristics of each of the multiple heterogeneous repositories as applied to each of the documents such that a user accesses one or more of the documents within the multiple heterogeneous repositories utilizing the meta-index.
11. A method for automatically cataloguing electronic documents located in multiple digital heterogeneous libraries comprising:
scanning each of the multiple digital heterogeneous libraries to ascertain identifying characteristics of the electronic documents located therein;
building an index to each of the electronic documents based on the identifying characteristics;
mapping each of the electronic documents to at least one predetermined class based on a comparison of the index of identifying characteristics to the keywords of a classification hierarchy associated with the at least one predetermined class; and
cataloguing each of the electronic documents into at least one predetermined class within an integrated library according to the comparison, wherein the integrated library retains the characteristics of each of the multiple heterogeneous libraries as applied to each of the electronic documents such that a user accesses one or more of the electronic documents within the multiple heterogeneous libraries utilizing the integrated library.
Dependent claims: 12, 13, 14
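The comparison against a classification hierarchy in claim 11 could, for example, be realized as a top-down walk that at each level picks the child class sharing the most keywords with the document's identifying characteristics. The claim does not prescribe a scoring rule, so the overlap count used here is a swapped-in assumption, and the `ClassNode` type, the `classify` function, and the sample hierarchy are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class ClassNode:
    name: str
    keywords: Set[str]
    children: List["ClassNode"] = field(default_factory=list)


def classify(characteristics: Set[str], node: ClassNode,
             path: Tuple[str, ...] = ()) -> Tuple[str, ...]:
    """Descend the hierarchy, choosing at each level the child whose
    keywords overlap most with the document; stop when no child overlaps."""
    path = path + (node.name,)
    best, best_score = None, 0
    for child in node.children:
        score = len(characteristics & child.keywords)
        if score > best_score:
            best, best_score = child, score
    if best is None:
        return path
    return classify(characteristics, best, path)


hierarchy = ClassNode("root", set(), [
    ClassNode("science", {"experiment", "theory"}, [
        ClassNode("physics", {"quantum", "particle"}),
        ClassNode("biology", {"cell", "gene"}),
    ]),
    ClassNode("law", {"statute", "court"}),
])

print(classify({"experiment", "quantum", "particle"}, hierarchy))
```

In this sketch a document that matches nothing stays at `root`, meaning it was not assigned to any predetermined class; how such documents are handled is left open.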
Specification