Use of federation services and transformation services to perform extract, transform, and load (ETL) of unstructured information and associated metadata
First Claim
1. A computer-implemented method for transforming unstructured information into content in a uniform context, comprising:
with a federation service of a computer including a processor that presents a single view of source content repositories to a user;
receiving a query specifying source content groups stored in a set of the source content repositories;
running the query to retrieve metadata schemas of the source content groups, wherein each source content group has a metadata schema that describes a structure of metadata associated with the unstructured information in the source content group;
extracting the unstructured information and metadata associated with the unstructured information from the set of the source content repositories;
in response to user input, receiving selection of target content groups in another set of target content repositories;
in response to receiving the selection of the target content groups, identifying metadata schemas of the target content groups, wherein each metadata schema describes a structure of metadata associated with the unstructured information in the target content group;
creating a schema definition file including the retrieved metadata schemas of the source content groups and the identified metadata schemas of the target content groups;
forwarding the unstructured information, metadata, and schema definition file to a transformation service of the computer;
receiving, from the transformation service, transformed unstructured information and transformed metadata; and
loading the transformed, unstructured information and the transformed metadata into the set of the target content repositories.
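The sequence of steps in claim 1 can be pictured as a small pipeline. The following is only a minimal sketch under stated assumptions, not the patented implementation: the in-memory repository layout, the function names (`find_groups`, `build_schema_definition`, `run_etl`, `strip_content`), and the use of JSON for the schema definition file are all hypothetical choices made for illustration.

```python
import json


def find_groups(repos, group_names):
    """Look up named content groups across all repositories (the 'single view')."""
    found = {}
    for groups in repos.values():
        for name, group in groups.items():
            if name in group_names:
                found[name] = group
    return found


def build_schema_definition(source_groups, target_groups):
    """Combine source and target metadata schemas into one document (JSON is an assumption)."""
    return json.dumps(
        {
            "source": {name: g["schema"] for name, g in source_groups.items()},
            "target": {name: g["schema"] for name, g in target_groups.items()},
        },
        sort_keys=True,
    )


def run_etl(source_repos, target_repos, query, target_selection, transform):
    """Federation-side flow: query, extract, build schema file, transform, load."""
    source_groups = find_groups(source_repos, query)              # run the query
    extracted = [item for g in source_groups.values() for item in g["items"]]
    target_groups = find_groups(target_repos, target_selection)   # user-selected targets
    schema_definition = build_schema_definition(source_groups, target_groups)
    transformed = transform(extracted, schema_definition)         # transformation service
    for group in target_groups.values():                          # load (simplified: every
        group["items"].extend(transformed)                        # target gets every item)
    return schema_definition


def strip_content(items, schema_definition):
    """Stand-in transformation service: trims whitespace, copies metadata unchanged."""
    return [
        {"content": item["content"].strip(), "metadata": dict(item["metadata"])}
        for item in items
    ]


# Hypothetical sample data: repository -> content group -> schema and items.
SOURCE_REPOS = {
    "repo_a": {
        "invoices": {
            "schema": {"cust": "string", "amt": "string"},
            "items": [
                {
                    "content": "  Invoice 17 for Acme \n",
                    "metadata": {"cust": "Acme", "amt": "12.50"},
                }
            ],
        }
    }
}
TARGET_REPOS = {
    "archive": {
        "billing": {"schema": {"customer": "string", "amount": "float"}, "items": []}
    }
}

schema_file = run_etl(SOURCE_REPOS, TARGET_REPOS, {"invoices"}, {"billing"}, strip_content)
```

Passing the transformation in as a callable keeps the federation-side flow separate from whatever the transformation service actually does, which mirrors the split between the two services in the claim.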
Abstract
Provided are techniques for transforming unstructured information into content in a uniform context. The unstructured information and metadata associated with the unstructured information are extracted from one or more source content repositories. One or more custom transformations are performed on at least one of the unstructured information and the metadata. At least one of the transformed, unstructured information and the metadata is loaded into one or more target content repositories.
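As an illustration of "one or more custom transformations performed on at least one of the unstructured information and the metadata", the sketch below chains small functions that each touch either the content or the metadata. The function names and the pairing of content with a metadata dictionary are assumptions made for this example; the source does not specify how transformations are represented.

```python
from typing import Callable, Dict, List, Tuple

# Assumed item shape: (unstructured content, metadata dictionary).
Item = Tuple[str, Dict[str, str]]
Transformation = Callable[[str, Dict[str, str]], Item]


def normalize_whitespace(content: str, metadata: Dict[str, str]) -> Item:
    """Custom transformation on the content only: collapse runs of whitespace."""
    return " ".join(content.split()), metadata


def lowercase_keys(content: str, metadata: Dict[str, str]) -> Item:
    """Custom transformation on the metadata only: lower-case every field name."""
    return content, {key.lower(): value for key, value in metadata.items()}


def apply_transformations(item: Item, transformations: List[Transformation]) -> Item:
    """Apply each custom transformation in order to one extracted item."""
    content, metadata = item
    for transformation in transformations:
        content, metadata = transformation(content, metadata)
    return content, metadata


result = apply_transformations(
    ("  Hello   world ", {"Author": "Ann"}),
    [normalize_whitespace, lowercase_keys],
)
# result == ("Hello world", {"author": "Ann"})
```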
27 Claims
1. Independent method claim, recited in full under First Claim.
Dependent claims: 2, 3, 4, 5, 6, 7, 11
8. A computer program product comprising a computer readable storage medium including a computer readable program, wherein the computer readable program when executed by a processor on a computer causes the computer to:
with a federation service of the computer that presents a single view of source content repositories to a user;
receive a query specifying source content groups stored in a set of the source content repositories;
run the query to retrieve metadata schemas of the source content groups, wherein each source content group has a metadata schema that describes a structure of metadata associated with the unstructured information in the source content group;
extract the unstructured information and metadata associated with the unstructured information from the set of the source content repositories;
in response to user input, receive selection of target content groups in another set of target content repositories;
in response to receiving the selection of the target content groups, identify metadata schemas of the target content groups, wherein each metadata schema describes a structure of metadata associated with the unstructured information in the target content group;
create a schema definition file including the retrieved metadata schemas of the source content groups and the identified metadata schemas of the target content groups;
forward the unstructured information, metadata, and schema definition file to a transformation service of the computer;
receive, from the transformation service, transformed unstructured information and transformed metadata; and
load the transformed, unstructured information and the metadata into the set of the target content repositories.
Dependent claims: 9, 10, 12, 13, 14
15. A system for transforming unstructured information into content in a uniform context, comprising:
hardware logic implemented in a computer to perform operations, the operations comprising:
with a federation service of the computer that presents a single view of source content repositories to a user;
receiving a query specifying source content groups stored in a set of the source content repositories;
running the query to retrieve metadata schemas of the source content groups, wherein each source content group has a metadata schema that describes a structure of metadata associated with the unstructured information in the source content group;
extracting the unstructured information and metadata associated with the unstructured information from the set of the source content repositories;
in response to user input, receiving selection of target content groups in another set of target content repositories;
in response to receiving the selection of the target content groups, identifying metadata schemas of the target content groups, wherein each metadata schema describes a structure of metadata associated with the unstructured information in the target content group;
creating a schema definition file including the retrieved metadata schemas of the source content groups and the identified metadata schemas of the target content groups;
forwarding the unstructured information, metadata, and schema definition file to a transformation service of the computer;
receiving, from the transformation service, transformed unstructured information and transformed metadata; and
loading the transformed, unstructured information and the metadata into the set of the target content repositories.
Dependent claims: 16, 17, 18, 19, 20, 21
22. A computer-implemented method for transforming unstructured information and associated metadata into content in a uniform context, comprising:
using a federation service of a computer including a processor that presents a single view of source content repositories to a user;
receiving a query specifying source content groups in a set of the source content repositories;
running the query to retrieve metadata schemas of the source content groups specified in the query, wherein each source content group has a metadata schema that describes a structure of metadata associated with the unstructured information in the source content group;
extracting the unstructured information and metadata associated with the unstructured information from the set of the source content repositories;
in response to user input, receiving selection of target content groups in another set of target content repositories;
in response to receiving the selection of the target content groups, identifying metadata schemas of the target content groups, wherein each metadata schema describes a structure of metadata associated with the unstructured information in a target content group;
creating a schema definition file including the extracted metadata schemas of the source content groups and the identified metadata schemas of the target content groups;
forwarding the unstructured information, metadata, and schema definition file to a transformation service of the computer;
using the transformation service, performing one or more custom mappings on at least one of the unstructured information and the associated metadata by mapping elements of the extracted metadata schemas of the source content groups to the identified metadata schemas of the target content groups;
transforming at least one of the unstructured information and the associated metadata with custom transformations; and
forwarding the mapped and transformed unstructured information and the associated metadata to the federation service.
Dependent claims: 23
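The custom mapping step of claim 22, in which elements of a source metadata schema are mapped to a target metadata schema, might look roughly like the sketch below. Everything here is a hypothetical illustration: the dictionary form of the schema definition, the `MAPPING` table, the `CASTS` type names, and the decision to drop unmapped source fields are assumptions, not details given in the claims.

```python
# Assumed schema definition contents: field name -> type name, per content group.
SCHEMA_DEFINITION = {
    "source": {"invoices": {"cust": "string", "amt": "string"}},
    "target": {"billing": {"customer": "string", "amount": "float"}},
}

# Hypothetical custom mapping: source schema element -> target schema element.
MAPPING = {"cust": "customer", "amt": "amount"}

# Hypothetical converters for the type names used in the schemas above.
CASTS = {"string": str, "float": float}


def map_metadata(metadata, mapping, target_schema):
    """Rename source fields to target fields and coerce values to the target types."""
    mapped = {}
    for source_field, value in metadata.items():
        target_field = mapping.get(source_field)
        if target_field is None or target_field not in target_schema:
            continue  # in this sketch, source fields without a mapping are dropped
        cast = CASTS[target_schema[target_field]]
        mapped[target_field] = cast(value)
    return mapped


mapped = map_metadata(
    {"cust": "Acme", "amt": "12.50", "note": "rush order"},
    MAPPING,
    SCHEMA_DEFINITION["target"]["billing"],
)
# mapped == {"customer": "Acme", "amount": 12.5}
```

Keeping the mapping as data rather than code is one possible design: the same function can then serve any pair of source and target content groups described in the schema definition.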
24. A computer program product comprising a computer readable storage medium storing a computer readable program, wherein the computer readable program when executed by a processor on a computer causes the computer to:
using a federation service of the computer that presents a single view of source content repositories to a user;
receive a query specifying source content groups in a set of the source content repositories;
run the query to retrieve metadata schemas of the source content groups specified in the query, wherein each source content group has a metadata schema that describes a structure of metadata associated with the unstructured information in the source content group;
extract the unstructured information and metadata associated with the unstructured information from the set of the source content repositories;
in response to user input, receive selection of target content groups in another set of target content repositories;
in response to receiving the selection of the target content groups, identify metadata schemas of the target content groups, wherein each metadata schema describes a structure of metadata associated with the unstructured information in a target content group;
creating a schema definition file including the extracted metadata schemas of the source content groups and the identified metadata schemas of the target content groups;
forwarding the unstructured information, metadata, and schema definition file to a transformation service of the computer; and
using the transformation service, perform one or more custom mappings on at least one of the unstructured information and the associated metadata by mapping elements of the extracted metadata schemas of the source content groups to the identified metadata schemas of the target content groups;
transform at least one of the unstructured information and the associated metadata with custom transformations; and
forward the mapped and transformed unstructured information and the associated metadata to the federation service.
Dependent claims: 25
26. A system for transforming unstructured information and associated metadata into content in a uniform context, comprising:
hardware logic implemented in a computer to perform operations, the operations comprising:
using a federation service of the computer that presents a single view of source content repositories to a user;
receiving a query specifying source content groups in a set of the source content repositories;
running the query to retrieve metadata schemas of the source content groups specified in the query, wherein each source content group has a metadata schema that describes structure of metadata associated with the unstructured information in the source content group;
extracting the unstructured information and metadata associated with the unstructured information from the set of the source content repositories;
in response to user input, receiving selection of target content groups in another set of target content repositories;
in response to receiving the selection of the target content groups, identifying metadata schemas of the target content groups, wherein each metadata schema describes a structure of metadata associated with the unstructured information in a target content group;
creating a schema definition file including the extracted metadata schemas of the source content groups and the identified metadata schemas of the target content groups;
forwarding the unstructured information, metadata, and schema definition file to a transformation service of the computer; and
using the transformation service, performing one or more custom mappings on at least one of the unstructured information and the associated metadata by mapping elements of the extracted metadata schemas of the source content groups to the identified metadata schemas of the target content groups;
transforming at least one of the unstructured information and the associated metadata with custom transformations; and
forwarding the mapped and transformed unstructured information and the associated metadata to the federation service.
Dependent claims: 27
Specification