System and method for automating data normalization using text analytics
First Claim
1. An automated data enhancement processing system for processing database data, comprising a processor and a memory, the memory including:
a system for ingesting database data structured in at least two disparate predefined database formats used by at least two different database management systems (DBMS), wherein the structured data are elements of at least one relational database, the elements representing values of well-defined entities;
a set of automated text analytics processes that, when applied to the ingested database data, treat the structured ingested database data as unstructured data to generate normalized data described by consistent and structured metadata, wherein the set of text analytics processes apply a shared set of metadata tags to the structured database data to describe the normalized data;
a system for processing the normalized data, wherein the processing includes translating and expanding the normalized data into a consistent structure which is ready for mining and OLAP (online analytical processing) applications; and
a system for generating mining and OLAP results using the normalized data.
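The central idea of the claim, treating rows from differently structured databases as plain text and describing what is found with one shared set of metadata tags, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the two example rows, the column names, the `PHONE` and `EMAIL` tags, and the regular-expression annotators are hypothetical and are not taken from the patent.

```python
import re

# Hypothetical shared tag set: every annotator emits one of these tags,
# no matter which source database the value came from.
ANNOTATORS = {
    "PHONE": re.compile(r"\(?\b(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})\b"),
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def row_to_text(row):
    """Flatten a structured row into plain text, ignoring column names."""
    return " | ".join(str(v) for v in row.values() if v is not None)


def normalize(tag, match):
    """Rewrite a matched value into one canonical spelling per tag."""
    if tag == "PHONE":
        return "-".join(match.groups())
    return match.group(0).lower()


def annotate(row, source):
    """Run every annotator over the flattened row text."""
    text = row_to_text(row)
    found = []
    for tag, pattern in ANNOTATORS.items():
        for m in pattern.finditer(text):
            found.append({"source": source, "tag": tag, "value": normalize(tag, m)})
    return found


# Two hypothetical rows describing the same customer in disparate schemas.
crm_row = {"cust_name": "Ann Lee", "tel": "(555) 123-4567", "mail": "Ann.Lee@Example.COM"}
billing_row = {"CUSTOMER": "LEE, ANN", "CONTACT": "555.123.4567 / ann.lee@example.com"}

crm_tags = annotate(crm_row, "crm")
billing_tags = annotate(billing_row, "billing")
```

Because the annotators never look at column names, both rows yield the same tagged values even though one schema stores phone and e-mail in separate columns and the other packs them into a single free-form field.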
Abstract
A system, method and program product for normalizing, sanitizing and disambiguating structured data. Structured data includes data stored in a database management system (DBMS), as well as labeled files (e.g., XML data). An automated data enhancement processing system is provided, comprising: a system for ingesting data structured in at least one predefined database format; and a set of text analytics processes that treat the ingested data as unstructured, and generate normalized data represented and indexed by consistent, structured metadata.
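As a rough illustration of sanitizing and disambiguating labeled-file data, the sketch below reads the text content of an XML document while ignoring its element names, cleans up whitespace and control characters, and maps known spelling variants to one canonical form. The alias dictionary, the function names, and the sample document are hypothetical; the abstract does not specify how these steps are performed.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical alias dictionary used for disambiguation.
CANONICAL = {
    "ibm": "IBM",
    "i.b.m.": "IBM",
    "international business machines": "IBM",
}


def sanitize(text):
    """Replace control characters and collapse runs of whitespace."""
    text = re.sub(r"[\x00-\x1f]+", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def disambiguate(text):
    """Map a known variant to its canonical form; pass others through."""
    return CANONICAL.get(text.lower(), text)


def normalize_xml(xml_string):
    """Treat the XML as unstructured text: element labels are ignored."""
    root = ET.fromstring(xml_string)
    return [disambiguate(sanitize(t)) for t in root.itertext() if t.strip()]


doc = (
    "<vendors><v>  I.B.M. </v>"
    "<name>International\n Business   Machines</name>"
    "<x>Acme</x></vendors>"
)
result = normalize_xml(doc)
```

A dictionary lookup is the simplest possible disambiguation strategy and is used here only to keep the example short.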
34 Citations
19 Claims
1. An automated data enhancement processing system for processing database data, comprising a processor and a memory, the memory including:
a system for ingesting database data structured in at least two disparate predefined database formats used by at least two different database management systems (DBMS), wherein the structured data are elements of at least one relational database, the elements representing values of well-defined entities;
a set of automated text analytics processes that, when applied to the ingested database data, treat the structured ingested database data as unstructured data to generate normalized data described by consistent and structured metadata, wherein the set of text analytics processes apply a shared set of metadata tags to the structured database data to describe the normalized data;
a system for processing the normalized data, wherein the processing includes translating and expanding the normalized data into a consistent structure which is ready for mining and OLAP (online analytical processing) applications; and
a system for generating mining and OLAP results using the normalized data.
Dependent claims: 2, 3, 4, 5, 6
7. A program product stored on a recordable medium, which when executed by a computer, normalizes and processes database data representation and metadata, comprising:
program code configured for ingesting database data structured in at least two disparate predefined database formats used by at least two different database management systems (DBMS), wherein the structured data are elements of at least one relational database, the elements representing values of well-known entities;
program code providing a set of automated text analytics processes that, when applied to the ingested database data, treat the ingested database data as unstructured data, and generates normalized data described by consistent and structured metadata, wherein the set of text analytics processes apply a shared set of metadata tags to describe the normalized data;
program code for processing the normalized data, wherein the processing includes translating and expanding the normalized data into a consistent structure which is ready for mining and OLAP (online analytical processing) applications; and
program code for generating mining and OLAP results using the normalized data.
Dependent claims: 8, 9, 10, 11, 12
13. A method for normalizing and processing database data, comprising:
ingesting database data structured in at least two disparate predefined database formats used by at least two different database management systems (DBMS), wherein the structured data are elements of at least one relational database, the elements representing values of well-defined entities;
performing a set of automated text analytics processes on the ingested database data to generate normalized data described by a consistent and structured metadata, wherein each of the text analytics processes treat the ingested database data as unstructured data, and wherein the set of text analytics processes apply a shared set of metadata tags to describe the normalized data;
processing the normalized data, wherein the processing includes translating and expanding the normalized data into a consistent structure which is ready for mining and OLAP (online analytical processing) applications; and
generating mining and OLAP results using the normalized data.
Dependent claims: 14, 15, 16, 17, 18
19. A method for deploying an automated data enhancement processing system for normalizing and processing database data, comprising:
providing a computer infrastructure being operable to:
ingest database data structured in at least two disparate predefined database formats used by at least two different database management systems (DBMS), wherein the structured data are elements of at least one relational database, the elements representing values of well-defined entities;
perform a set of automated text analytics processes on the ingested database data to generate normalized data described by a consistent, indexed and structured metadata, wherein each of the text analytics processes treat the ingested data as unstructured data, and wherein the set of text analytics processes apply a shared set of metadata tags to describe the normalized data;
process the normalized data, wherein the processing includes translating and expanding the normalized data into a consistent structure which is ready for mining and OLAP (online analytical processing) applications; and
generate mining and OLAP results using the normalized data.
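The last two steps recited in the independent claims, translating normalized data into a consistent structure and generating OLAP results from it, might look roughly like the following Python sketch. The tag names (`REGION`, `PRODUCT`, `AMOUNT`), the sample records, and the helper functions are assumptions chosen for illustration; the claims do not prescribe a fact-table layout or a particular aggregation.

```python
from collections import defaultdict

# Hypothetical output of the text analytics stage: values from different
# source systems, already described by the same shared tags.
normalized = [
    {"source": "crm", "tags": {"REGION": "EMEA", "PRODUCT": "Widget", "AMOUNT": "120.50"}},
    {"source": "billing", "tags": {"REGION": "EMEA", "PRODUCT": "Widget", "AMOUNT": "79.50"}},
    {"source": "billing", "tags": {"REGION": "APAC", "PRODUCT": "Gadget", "AMOUNT": "200"}},
]

DIMENSIONS = ("REGION", "PRODUCT")
MEASURE = "AMOUNT"


def to_fact_rows(records):
    """Translate tagged records into uniform (dimension..., measure) rows."""
    rows = []
    for rec in records:
        tags = rec["tags"]
        dims = tuple(tags.get(d, "UNKNOWN") for d in DIMENSIONS)
        rows.append(dims + (float(tags[MEASURE]),))
    return rows


def roll_up(rows, keep):
    """Sum the measure over the dimension positions listed in `keep`."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[i] for i in keep)
        totals[key] += row[-1]
    return dict(totals)


facts = to_fact_rows(normalized)
by_region = roll_up(facts, keep=(0,))
by_region_product = roll_up(facts, keep=(0, 1))
```

Once every record shares the same dimensions and measure, a roll-up along any subset of dimensions becomes a simple grouped sum, which is the kind of query an OLAP application would issue against the consistent structure.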
Specification