Systems and methods for general aggregation of characteristics and key figures
First Claim
1. A computer-implemented method for automated generic and parallel aggregation of characteristics and key figures of data associated with financial institutions and with financial affairs in banking practice, the method comprising:
receiving, at a data processing system, mass data from a single database of a single data source or from different databases of different data sources, the mass data comprising a plurality of records, each record being associated with one or more granularity characteristics and one or more key figures;
selecting, according to a customer-defined aggregation, one or more of the granularity characteristics of the received mass data, one or more of the key figures of the received mass data, and an aggregation operation associated with each of the one or more key figures;
generating a plurality of data packages from the received mass data, each data package comprising a plurality of records, the plurality of records of each data package being smaller than the plurality of records of the received mass data;
processing, using one or more processors of the data processing system, the data packages to reduce a number of records in each of the data packages according to the customer-defined aggregation, wherein the processing comprises:
identifying one or more granularity levels, each of the granularity levels being associated with one of the selected granularity characteristics, and the identified granularity levels defining an order of the selected granularity characteristics;
sorting the records of each data package according to the defined order of granularity characteristics; and
aggregating the sorted records of each data package for each of the selected key figures using the selected aggregation operations, the aggregation reducing the plurality of records of each data package;
splitting each of the aggregated data packages into one or more sub data packages, wherein each sub data package of an aggregated data package comprises fewer records than the aggregated data package; and
identifying adjacent sub data packages by comparing, for each sub data package, a key of a first record of the sub data package with a key of a first record and a key of a last record of each of the other sub data packages, the identifying comprising computing a termination criterion;
$\mathrm{key}_{\mathrm{pos1},x} \in (\mathrm{key}_{\mathrm{pos1},y};\ \mathrm{key}_{\mathrm{posmax},y})$,
wherein pos1 represents a first position of a sub data package, posmax represents a last position of a sub data package, and x and y represent numbers of sub data packages, wherein adjacent sub data packages are sub data packages having keys of the first record that are closest together and have violated the termination criterion; and
saving, to a memory of the data processing system, the processed records, wherein the stored records comprise fewer records than the received mass data at the customer-defined granularity.
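The packaging, sorting and aggregating steps recited in claim 1 can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the record layout (dictionaries), the field names `country`, `rating`, `exposure` and `limit`, the `OPERATIONS` table, and the helper names `make_packages` and `aggregate_package` are all hypothetical.

```python
from itertools import groupby

# Hypothetical operation table; the claim only requires that each selected
# key figure has an associated aggregation operation.
OPERATIONS = {"sum": sum, "min": min, "max": max}

def make_packages(records, package_size):
    """Cut the mass data into packages that are smaller than the full set."""
    return [records[i:i + package_size]
            for i in range(0, len(records), package_size)]

def aggregate_package(package, characteristics, key_figures):
    """Sort one package by the ordered characteristics, then collapse groups.

    characteristics: list of field names; its order plays the role of the
                     granularity levels.
    key_figures:     mapping of field name -> operation name in OPERATIONS.
    """
    key_of = lambda record: tuple(record[c] for c in characteristics)
    result = []
    for key, group in groupby(sorted(package, key=key_of), key=key_of):
        group = list(group)
        row = dict(zip(characteristics, key))
        for figure, op in key_figures.items():
            row[figure] = OPERATIONS[op](r[figure] for r in group)
        result.append(row)
    return result

# Made-up sample data for illustration only.
records = [
    {"country": "DE", "rating": "A", "exposure": 100, "limit": 500},
    {"country": "FR", "rating": "B", "exposure": 50, "limit": 200},
    {"country": "DE", "rating": "A", "exposure": 25, "limit": 300},
    {"country": "FR", "rating": "B", "exposure": 10, "limit": 400},
]
packages = make_packages(records, 3)
aggregated = aggregate_package(
    packages[0], ["country", "rating"], {"exposure": "sum", "limit": "max"}
)
```

In this sketch the first package shrinks from three records to two, because the two `("DE", "A")` records collapse into one row. Records that share a key but sit in different packages remain separate at this stage, which is why the claim goes on to split the aggregated packages and identify adjacent sub data packages.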
Abstract
Computer-implemented methods, computer systems, and computer program products are provided for automated generic and parallel aggregation of characteristics and key figures of unsorted mass data of specific economic interest, particularly data associated with financial institutions and with financial affairs in banking practice. The parallel aggregation may reduce the amount of data to a customer-defined granularity in order to facilitate the handling of raw data related to all areas of credit risk management in banking practice. Moreover, the computing power of the software and its run-time performance may be improved in the case of mass data.
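To make the idea of parallel, package-wise aggregation concrete, the following Python sketch dispatches packages to a pool of workers and then merges the partial results. It rests on several assumptions that are not in the source: records are `(key, amount)` tuples, the only aggregation operation is a sum, and the final merge is a plain dictionary merge rather than the sub data package procedure recited in the claims. The names `aggregate_sum` and `parallel_aggregate` are hypothetical. A thread pool is used only to show the dispatch structure; in CPython, a process pool would be the more likely choice for CPU-bound work.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def aggregate_sum(package):
    """Sum the key figure per granularity key within one package."""
    totals = defaultdict(int)
    for key, amount in package:
        totals[key] += amount
    return dict(totals)

def parallel_aggregate(packages, workers=4):
    """Aggregate packages concurrently, then merge the partial sums."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(aggregate_sum, packages))
    merged = defaultdict(int)
    for partial in partials:
        for key, amount in partial.items():
            merged[key] += amount
    return dict(merged)

# Made-up sample packages for illustration only.
packages = [
    [("DE", 100), ("FR", 50), ("DE", 25)],
    [("FR", 10), ("IT", 5)],
]
result = parallel_aggregate(packages)
```

Merging partial results this way is valid for operations such as sum, minimum and maximum. An average would need the partial counts carried along as well.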
23 Claims
1. A computer-implemented method for automated generic and parallel aggregation of characteristics and key figures of data associated with financial institutions and with financial affairs in banking practice, the method comprising:
receiving, at a data processing system, mass data from a single database of a single data source or from different databases of different data sources, the mass data comprising a plurality of records, each record being associated with one or more granularity characteristics and one or more key figures;
selecting, according to a customer-defined aggregation, one or more of the granularity characteristics of the received mass data, one or more of the key figures of the received mass data, and an aggregation operation associated with each of the one or more key figures;
generating a plurality of data packages from the received mass data, each data package comprising a plurality of records, the plurality of records of each data package being smaller than the plurality of records of the received mass data;
processing, using one or more processors of the data processing system, the data packages to reduce a number of records in each of the data packages according to the customer-defined aggregation, wherein the processing comprises:
identifying one or more granularity levels, each of the granularity levels being associated with one of the selected granularity characteristics, and the identified granularity levels defining an order of the selected granularity characteristics;
sorting the records of each data package according to the defined order of granularity characteristics; and
aggregating the sorted records of each data package for each of the selected key figures using the selected aggregation operations, the aggregation reducing the plurality of records of each data package;
splitting each of the aggregated data packages into one or more sub data packages, wherein each sub data package of an aggregated data package comprises fewer records than the aggregated data package; and
identifying adjacent sub data packages by comparing, for each sub data package, a key of a first record of the sub data package with a key of a first record and a key of a last record of each of the other sub data packages, the identifying comprising computing a termination criterion;
$\mathrm{key}_{\mathrm{pos1},x} \in (\mathrm{key}_{\mathrm{pos1},y};\ \mathrm{key}_{\mathrm{posmax},y})$,
wherein pos1 represents a first position of a sub data package, posmax represents a last position of a sub data package, and x and y represent numbers of sub data packages, wherein adjacent sub data packages are sub data packages having keys of the first record that are closest together and have violated the termination criterion; and
saving, to a memory of the data processing system, the processed records, wherein the stored records comprise fewer records than the received mass data at the customer-defined granularity.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 18, 19
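The comparison behind the termination criterion can be sketched in Python as follows. This is one possible reading, offered with caution: it assumes records are `(key, value)` tuples with orderable keys, that each sub data package is already sorted by key, and that the parentheses in the criterion denote an open interval. The helper names `split_package`, `first_key_in_range` and `overlapping_pairs` are hypothetical, and the sketch only evaluates the membership test for each pair. How the claim's "violated" maps onto the result of that test is left open here.

```python
from operator import itemgetter

def split_package(package, sub_size):
    """Split a sorted, aggregated package into smaller sub data packages."""
    return [package[i:i + sub_size] for i in range(0, len(package), sub_size)]

def first_key_in_range(sub_x, sub_y, key_of):
    """Evaluate key_{pos1,x} in (key_{pos1,y}; key_{posmax,y}).

    Compares the key of the first record of sub package x with the keys of
    the first and last records of sub package y (open interval assumed).
    """
    return key_of(sub_y[0]) < key_of(sub_x[0]) < key_of(sub_y[-1])

def overlapping_pairs(subs, key_of):
    """Return index pairs (x, y) for which the membership test holds."""
    return [
        (x, y)
        for x in range(len(subs))
        for y in range(len(subs))
        if x != y and first_key_in_range(subs[x], subs[y], key_of)
    ]

# Two made-up aggregated packages, each sorted by integer key.
package_a = [(1, 10), (3, 10), (5, 10), (7, 10)]
package_b = [(2, 20), (4, 20), (6, 20), (8, 20)]

subs = split_package(package_a, 2) + split_package(package_b, 2)
pairs = overlapping_pairs(subs, itemgetter(0))
```

With this sample data, the sub package starting at key 2 falls inside the key range 1 to 3 of the first sub package, and the one starting at key 6 falls inside the range 5 to 7, so their key ranges interleave.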
12. A computer system configured to perform automated generic and parallel aggregation of characteristics and key figures of data associated with financial institutions and with financial affairs in banking practice, comprising:
a module configured to receive mass data from a single database of a single data source or from different databases of different data sources, the mass data comprising a plurality of records, each record being associated with one or more granularity characteristics and one or more key figures;
a module configured to select, according to a customer-defined aggregation, one or more of the granularity characteristics of the received mass data, one or more of the key figures of the received mass data, and an aggregation operation associated with each of the one or more key figures;
a module configured to generate a plurality of data packages from the received mass data, each data package comprising a plurality of records, the plurality of records of each data package being smaller than the plurality of records of the received mass data;
one or more processors configured to process the data packages to reduce a number of records in each of the data packages according to the customer-defined aggregation, wherein the one or more processors are further configured to:
identify one or more granularity levels, each of the granularity levels being associated with one of the selected granularity characteristics, and the identified granularity levels defining an order of the selected granularity characteristics;
sort the records of each data package according to the defined order of granularity characteristics;
aggregate the sorted records of each data package for each of the selected key figures using the selected aggregation operations, the aggregation reducing the plurality of records of each data package;
split each of the aggregated data packages into one or more sub data packages, wherein each sub data package of an aggregated data package comprises fewer records than the aggregated data package; and
identify adjacent sub data packages by comparing, for each sub data package, a key of a first record of the sub data package with a key of a first record and a key of a last record of each of the other sub data packages, the identifying comprising computing a termination criterion;
$\mathrm{key}_{\mathrm{pos1},x} \in (\mathrm{key}_{\mathrm{pos1},y};\ \mathrm{key}_{\mathrm{posmax},y})$,
wherein pos1 represents a first position of a sub data package, posmax represents a last position of a sub data package, and x and y represent numbers of sub data packages, wherein adjacent sub data packages are sub data packages having keys of the first record that are closest together and have violated the termination criterion; and
a memory configured to store the processed records, wherein the stored records comprise fewer records than the received mass data at the customer-defined granularity.
Dependent claims: 13, 14, 20, 21
15. A computer readable medium comprising a plurality of instructions that, when executed by a processor, perform a method for automated generic and parallel aggregation of characteristics and key figures of data associated with financial institutions and with financial affairs in banking practice, the method comprising:
receiving mass data from a single database of a single data source or from different databases of different data sources, the mass data comprising a plurality of records, each record being associated with one or more granularity characteristics and one or more key figures;
selecting, according to a customer-defined aggregation, one or more of the granularity characteristics of the received mass data, one or more of the key figures of the received mass data, and an aggregation operation associated with each of the one or more key figures;
generating a plurality of data packages from the received mass data, each data package comprising a plurality of records, the plurality of records of each data package being smaller than the plurality of records of the received mass data;
processing the data packages to reduce a number of records in each of the data packages according to the customer-defined aggregation, wherein the processing comprises:
identifying one or more granularity levels, each of the granularity levels being associated with one of the selected granularity characteristics, and the identified granularity levels defining an order of the selected granularity characteristics;
sorting the records of each data package according to the defined order of granularity characteristics; and
aggregating the sorted records of each data package for each of the selected key figures using the selected aggregation operations, the aggregation reducing the plurality of records of each data package;
splitting each of the aggregated data packages into one or more sub data packages, wherein each sub data package of an aggregated data package comprises fewer records than the aggregated data package; and
identifying adjacent sub data packages by comparing, for each sub data package, a key of a first record of the sub data package with a key of a first record and a key of a last record of each of the other sub data packages, the identifying comprising computing a termination criterion;
$\mathrm{key}_{\mathrm{pos1},x} \in (\mathrm{key}_{\mathrm{pos1},y};\ \mathrm{key}_{\mathrm{posmax},y})$,
wherein pos1 represents a first position of a sub data package, posmax represents a last position of a sub data package, and x and y represent numbers of sub data packages, wherein adjacent sub data packages are sub data packages having keys of the first record that are closest together and have violated the termination criterion; and
saving the processed records, wherein the stored records comprise fewer records than the received mass data at the customer-defined granularity.
Dependent claims: 16, 17, 22, 23
Specification