Computer implemented system and method for investigative data analytics
First Claim
1. A computer implemented analytical database server, said database server comprising:
a data loader configured to receive data tables from a plurality of sources, said data loader further configured to convert the received data into a tabular format to create a plurality of source data tables, said data loader configured to load the source data tables onto a Hadoop Distributed File System (HDFS);
a processor cooperating with said Hadoop Distributed File System to process the data loaded thereto, said processor further configured to reorganize the data and the corresponding metadata into at least one first data table, second data table and third data table, wherein:
said first data table comprises unique numerical identifiers assigned to respective source data tables, said first data table further comprising unique first identifiers corresponding to each of the columns of respective source data tables, said unique first identifiers linked to at least one of said unique numerical identifiers, said first data table further comprising the original names of respective source data tables;
said second data table comprising a plurality of second identifiers mapped to the respective first identifiers stored in said first table, a plurality of third identifiers mapped to the respective numerical identifiers stored in said first table, said second data table further configured to store the data reorganized by the processor, wherein the data reorganized by the processor is linked to said second identifiers and said third identifiers;
said third data table configured to store the unique numerical identifiers representing respective source data tables, in the form of unidirectional relationship notations; and
a query builder configured to receive at least one keyword from a user, said query builder further configured to:
query said first table to identify the source data table relevant to said keyword;
determine from said first table, the unique numerical identifier corresponding to the identified source data table;
identify from said first table, the first identifiers linked to the identified unique numerical identifier; and
query the second data table to identify the second identifiers mapped onto said first identifiers, and query the second data table to identify the third identifiers mapped to the identified numerical identifier;
query the second data table to elicit the data linked to said second identifiers and third identifiers, and display elicited data to the user along with information including the source data table name and source data table column name, in which the elicited data was located;
said query builder further configured to dynamically modify the generated query in response to the keywords provided by said user.
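The lookup sequence recited for the query builder can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the claim: the in-memory lists standing in for the first and second data tables, the field names (`table_id`, `columns`, `first_id`, `second_id`, `third_id`), the sample values, and the substring match used to decide which source table is relevant to a keyword. Dynamic modification of the generated query is not modeled.

```python
# Hypothetical stand-in for the first data table: one record per source table,
# holding its numerical identifier, original name, and first (column) identifiers.
FIRST_TABLE = [
    {"table_id": 1, "name": "customers", "columns": {"C1": "name", "C2": "city"}},
    {"table_id": 2, "name": "orders", "columns": {"C3": "customer", "C4": "item"}},
]

# Hypothetical stand-in for the second data table: each entry carries a second
# identifier mapped to a first identifier, a third identifier mapped to a
# numerical table identifier, and the reorganized data value itself.
SECOND_TABLE = [
    {"second_id": "S1", "first_id": "C1", "third_id": "T1", "table_id": 1, "value": "Asha"},
    {"second_id": "S2", "first_id": "C2", "third_id": "T1", "table_id": 1, "value": "Pune"},
    {"second_id": "S3", "first_id": "C3", "third_id": "T2", "table_id": 2, "value": "Asha"},
    {"second_id": "S4", "first_id": "C4", "third_id": "T2", "table_id": 2, "value": "laptop"},
]


def run_keyword_query(keyword, first_table=FIRST_TABLE, second_table=SECOND_TABLE):
    """Return the data elicited for a keyword, with source table and column names."""
    results = []
    needle = keyword.lower()
    for record in first_table:
        # Identify the source table relevant to the keyword
        # (assumption: a case-insensitive substring match on the table name).
        if needle not in record["name"].lower():
            continue
        # Determine the numerical identifier of the identified source table.
        table_id = record["table_id"]
        # Identify the first identifiers linked to that numerical identifier.
        first_ids = set(record["columns"])
        # Find second-table entries whose second/third identifiers map back to
        # those first identifiers and that numerical identifier, then elicit the data.
        for entry in second_table:
            if entry["first_id"] in first_ids and entry["table_id"] == table_id:
                results.append({
                    "value": entry["value"],
                    "source_table": record["name"],
                    "source_column": record["columns"][entry["first_id"]],
                })
    return results
```

Calling `run_keyword_query("orders")` on the sample data returns the two values stored for the hypothetical `orders` table, each labeled with the table name and the column name in which it was located.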
Abstract
An analytical database server and a method for enabling investigative data analytics have been disclosed. The database server comprises a data loader that receives data from a plurality of data sources, and loads the received source data tables onto a Hadoop Distributed File System (HDFS). A processor processes the source data tables loaded onto the HDFS and assigns a unique turf row (TR) identifier to each of the records present in the source data tables. The source data tables and the corresponding metadata are organized into a first data table, a second data table and a third data table. The first table comprises a record for each of the received source data tables. The second data table stores, in the form of an inverted list, the data originally contained in the received source data tables. The third data table stores the unidirectional relationships between the source data tables.
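The three-table organization described in the abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented implementation: the data lives in memory rather than on HDFS, the "inverted list" is interpreted as a mapping from each data value to the places it occurs, the relationships between source tables are supplied explicitly as (from, to) name pairs because the abstract does not say how they are derived, and the function name `organize` and the identifier formats (`C1`, `TR1`) are hypothetical.

```python
from collections import defaultdict
from itertools import count


def organize(source_tables, relationships):
    """Reorganize source tables into three tables (illustrative sketch).

    source_tables: {table name: list of row dicts sharing the same columns}
    relationships: list of (from_table_name, to_table_name) pairs (assumed input)
    """
    table_ids = count(1)
    column_ids = count(1)
    row_ids = count(1)

    first_table = []                  # one record per source table
    second_table = defaultdict(list)  # inverted list: value -> postings
    name_to_id = {}

    for name, rows in source_tables.items():
        table_id = next(table_ids)
        name_to_id[name] = table_id
        # Assumption: every row of a table has the same columns as its first row.
        column_names = list(rows[0]) if rows else []
        columns = {col: f"C{next(column_ids)}" for col in column_names}
        first_table.append({"table_id": table_id, "name": name, "columns": columns})

        for row in rows:
            tr = f"TR{next(row_ids)}"  # unique turf row (TR) identifier per record
            for col, value in row.items():
                # Each posting records where the value occurred:
                # (table identifier, column identifier, TR identifier).
                second_table[value].append((table_id, columns[col], tr))

    # Third table: unidirectional relationships as (from_id, to_id) pairs.
    third_table = [(name_to_id[a], name_to_id[b]) for a, b in relationships]
    return first_table, dict(second_table), third_table
```

For example, with a hypothetical `customers` table containing one row (`name` = "Asha", `city` = "Pune") and an `orders` table containing one row (`customer` = "Asha", `item` = "laptop"), the value "Asha" ends up with two postings in the second table, one per source table, so a single lookup finds it in both.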
9 Claims
1. A computer implemented analytical database server, said database server comprising:
a data loader configured to receive data tables from a plurality of sources, said data loader further configured to convert the received data into a tabular format to create a plurality of source data tables, said data loader configured to load the source data tables onto a Hadoop Distributed File System (HDFS);
a processor cooperating with said Hadoop Distributed File System to process the data loaded thereto, said processor further configured to reorganize the data and the corresponding metadata into at least one first data table, second data table and third data table, wherein:
said first data table comprises unique numerical identifiers assigned to respective source data tables, said first data table further comprising unique first identifiers corresponding to each of the columns of respective source data tables, said unique first identifiers linked to at least one of said unique numerical identifiers, said first data table further comprising the original names of respective source data tables;
said second data table comprising a plurality of second identifiers mapped to the respective first identifiers stored in said first table, a plurality of third identifiers mapped to the respective numerical identifiers stored in said first table, said second data table further configured to store the data reorganized by the processor, wherein the data reorganized by the processor is linked to said second identifiers and said third identifiers;
said third data table configured to store the unique numerical identifiers representing respective source data tables, in the form of unidirectional relationship notations; and
a query builder configured to receive at least one keyword from a user, said query builder further configured to:
query said first table to identify the source data table relevant to said keyword;
determine from said first table, the unique numerical identifier corresponding to the identified source data table;
identify from said first table, the first identifiers linked to the identified unique numerical identifier; and
query the second data table to identify the second identifiers mapped onto said first identifiers, and query the second data table to identify the third identifiers mapped to the identified numerical identifier;
query the second data table to elicit the data linked to said second identifiers and third identifiers, and display elicited data to the user along with information including the source data table name and source data table column name, in which the elicited data was located;
said query builder further configured to dynamically modify the generated query in response to the keywords provided by said user.
Dependent claims: 2, 3, 4, 5
6. A computer implemented method for enabling investigative data analytics, said method comprising the following computer implemented steps:
receiving data from a plurality of sources, using a data loader, and converting the received data into a tabular format to create a plurality of source data tables;
reorganizing, using a processor, the data and the corresponding metadata into at least one first data table, second data table and third data table;
assigning unique numerical identifiers to respective source data tables;
assigning unique first identifiers to each of the columns of respective source data tables, and linking said first identifiers to at least one of said unique numerical identifiers;
storing said numerical identifiers, said first identifiers, and original names of respective source data tables, in said first data table;
mapping a plurality of second identifiers to the respective first identifiers stored in said first table, and mapping a plurality of third identifiers to the respective numerical identifiers stored in said first table;
storing said second identifiers, said third identifiers and the data reorganized by the processor, in said second data table, and linking the data reorganized by the processor to said second identifiers and said third identifiers;
storing said numerical identifiers representing respective source data tables, in the form of unidirectional relationship notations, in said third data table;
receiving at least one keyword in the form of a query from a user;
querying said first table, using a query builder, and identifying the source data table relevant to said keyword;
determining from said first table, the numerical identifier corresponding to the identified source data table;
identifying from said first table, the first identifiers linked to the identified unique numerical identifier;
querying the second data table to identify the second identifiers mapped onto said first identifiers, and querying the second data table to identify the third identifiers mapped to the identified numerical identifier;
querying the second data table to elicit the data linked to said second identifiers and third identifiers; and
displaying elicited data to the user along with information including the source data table name and source data table column name in which the elicited data was located.
Dependent claims: 7, 8, 9
Specification