Scoring of insurance data
Abstract
Methods, apparatuses, and systems for applying models to score insurance data are disclosed. In one aspect, a system comprising a master node, a plurality of nodes in at least one cluster connected to the master node, and one or more computing devices connected to the master node is disclosed, where the master node is configured to distribute, using a HIVE module, an insurance scoring script and a predictive model to each of the plurality of nodes. The master node may call a function of the HIVE module to instruct each of the plurality of nodes to execute the insurance scoring script to generate scored results, wherein the scored results are written into a results table, and wherein the scored results comprise insurance scores for a plurality of customers.
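As a rough illustration of what an insurance scoring script run on each node might look like, the sketch below reads tab-separated customer rows and appends one score column. Everything specific here is an assumption made for illustration: the row layout (customer ID, age), the logistic form, and the coefficients `INTERCEPT` and `AGE_COEF` are hypothetical and do not come from the patent. In a streaming deployment the lines would typically arrive on standard input; a list is used here so the sketch is self-contained.

```python
import math

# Hypothetical coefficients; a real script would load a fitted model
# from the file path distributed by the master node.
INTERCEPT = -1.0
AGE_COEF = 0.02

def score_row(fields):
    """Return the input fields with one illustrative score appended."""
    age = float(fields[1])
    score = 1.0 / (1.0 + math.exp(-(INTERCEPT + AGE_COEF * age)))
    return fields + [f"{score:.4f}"]

def score_stream(lines):
    """Score each tab-separated line and yield it with the extra column."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        yield "\t".join(score_row(fields))

# Two hypothetical customers: (customer ID, age).
rows = ["c1\t50\n", "c2\t30\n"]
scored = list(score_stream(rows))
```

Each output row keeps the original columns and adds the score as a trailing column, which mirrors how the scored results are written into a results table.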
28 Citations
19 Claims
1. A system comprising:
a master node comprising a memory;
a plurality of nodes in at least one cluster connected to the master node; and
one or more computing devices connected to the master node, wherein the master node is configured to:
create an input table in the memory;
load, into the input table, insurance data associated with a plurality of customers;
create a first file directory path to an insurance scoring script stored on at least one of the one or more computing devices, a second file directory path to a first predictive model stored on at least one of the one or more computing devices, and a third file directory path to a second predictive model stored on at least one of the one or more computing devices;
create a results table in the memory, the results table including the input table, a first additional column, and a second additional column;
assign, using a complementary group rating model and the insurance data, the plurality of customers to a respective tier among a plurality of tiers;
call a function of a Hive module that:
divides the insurance data into a plurality of data portions, wherein the insurance data is divided into the plurality of data portions based on a respective tier such that each of the plurality of data portions comprises a portion of the insurance data that is associated with one or more customers in a same tier, and wherein the master node delivers each of the plurality of data portions to a respective one of the plurality of nodes; and
instructs each of the plurality of nodes to execute, in parallel, the insurance scoring script to sequentially apply the first predictive model and the second predictive model to a specific portion of the plurality of data portions to generate a first plurality of scored results and a second plurality of scored results, respectively, wherein the first predictive model and the second predictive model are different, wherein at least one of the first predictive model and the second predictive model comprises a generalized linear model, a generalized boosted model, or a regression model, wherein the second predictive model generates the second plurality of scored results based on the first plurality of scored results generated by the first predictive model, and wherein the first plurality of scored results comprises insurance premiums or insurance policy renewal rates;
compile and write the first plurality of scored results into the first additional column of the results table and the second plurality of scored results into the second additional column of the results table; and
output the results table, comprising the first additional column and the second additional column, to at least one of the one or more computing devices.
Dependent claims: 2, 3, 4, 11, 12, 13
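The data flow recited in claim 1 (divide by tier, apply two models in sequence, write two additional columns) can be sketched in a few lines. This is a minimal single-process sketch under stated assumptions, not the claimed system: tiers are taken as already assigned, the two model functions and their coefficients are hypothetical, and the column names `score_1` and `score_2` are invented for illustration.

```python
from collections import defaultdict

def first_model(row):
    # Hypothetical linear regression that produces a premium.
    return 500.0 + 10.0 * row["claims"]

def second_model(premium):
    # Hypothetical second model that consumes the first model's output.
    return 1.0 - premium / 1000.0

def partition_by_tier(rows):
    """Group rows so that each portion holds customers of a single tier."""
    portions = defaultdict(list)
    for row in rows:
        portions[row["tier"]].append(row)
    return portions

def score_portion(portion):
    """Apply the first model, then the second, to every row of a portion."""
    results = []
    for row in portion:
        premium = first_model(row)
        follow_up = second_model(premium)
        results.append({**row, "score_1": premium, "score_2": follow_up})
    return results

# Hypothetical input table with tiers already assigned.
input_table = [
    {"customer": "a", "tier": 1, "claims": 0},
    {"customer": "b", "tier": 2, "claims": 3},
    {"customer": "c", "tier": 1, "claims": 1},
]

# Compile the per-tier results into one results table.
results_table = []
for tier, portion in sorted(partition_by_tier(input_table).items()):
    results_table.extend(score_portion(portion))
```

The results table keeps every input column and adds one column per model, with the second column computed from the first.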
5. A method comprising:
creating, by a computer processor of a master node, an input table in a memory of the master node;
loading, into the input table, insurance data of a plurality of customers;
creating a first file directory path to an insurance scoring script, a second file directory path to a first predictive model, and a third file directory path to a second predictive model;
creating, by the computer processor of the master node, a results table in the memory, the results table including the input table, a first additional column, and a second additional column;
calling a function of a Hive module that divides the insurance data into a plurality of data portions and instructs each of a plurality of nodes in at least one cluster to execute, in parallel, the insurance scoring script that sequentially applies the first predictive model and the second predictive model to a specific portion of the plurality of data portions to generate a first plurality of scored results and a second plurality of scored results, respectively, wherein the insurance scoring script applies the first predictive model to generate insurance premiums or insurance policy renewal rates as the first plurality of scored results, wherein the first predictive model and second predictive model are different, wherein at least one of the first predictive model and the second predictive model comprises a generalized linear model, a generalized boosted model, or a regression model, wherein the second predictive model generates the second plurality of scored results based on the first plurality of scored results generated by the first predictive model, and wherein execution, in parallel, of the insurance scoring script comprises:
simultaneously loading, by a first node among the plurality of nodes and a second node among the plurality of nodes, the first predictive model;
simultaneously defining, by the first node and the second node, first model parameters for the first predictive model; and
simultaneously generating, by the first node, a calculated score for each customer of a first portion of the plurality of data portions and generating, by the second node, a calculated score for each customer of a second portion of the plurality of data portions;
compiling and writing, by the computer processor of the master node, the first plurality of scored results into the first additional column of the results table and the second plurality of scored results into the second additional column of the results table; and
outputting the results table, comprising the first additional column and the second additional column, to one or more computing devices.
Dependent claims: 6, 7, 8, 9, 10
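The parallel steps of claim 5, in which two nodes each load the model, define its parameters, and score their own portion at the same time, can be approximated on one machine. In the sketch below, worker threads stand in for cluster nodes, which is an assumption made purely for illustration; `load_model`, the parameter names, and the data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def load_model():
    # Stand-in for reading a serialized model from its file directory path.
    return lambda params, row: params["base"] + params["per_claim"] * row["claims"]

def run_on_node(portion):
    model = load_model()                             # each node loads the model
    params = {"base": 500.0, "per_claim": 10.0}      # each node defines parameters
    return [model(params, row) for row in portion]   # one score per customer

portions = [
    [{"claims": 0}, {"claims": 2}],  # first node's portion
    [{"claims": 5}],                 # second node's portion
]

# Two workers process the two portions concurrently; map preserves order.
with ThreadPoolExecutor(max_workers=2) as pool:
    scores = list(pool.map(run_on_node, portions))
```

Because `map` returns results in input order, the master side can compile the per-portion scores back into the results table without tracking which worker finished first.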
14. A system comprising:
a master node comprising a memory;
a plurality of nodes in at least one cluster connected to the master node; and
one or more computing devices connected to the master node, wherein the master node is configured to:
create an input table in the memory;
load, into the input table, insurance data of a plurality of customers;
create a first file directory path to an insurance scoring script, a second file directory path to a first predictive model, and a third file directory path to a second predictive model;
create a results table in the memory, the results table including the input table, a first additional column, and a second additional column;
call a function of a Hive module that divides the insurance data into a plurality of data portions and instructs each of the plurality of nodes to execute, in parallel, the insurance scoring script that sequentially applies the first predictive model and the second predictive model to a specific portion of the plurality of data portions to generate a first plurality of scored results and a second plurality of scored results, respectively, wherein the insurance scoring script applies the first predictive model to generate insurance premiums or insurance policy renewal rates as the first plurality of scored results, wherein the first predictive model and second predictive model are different, wherein at least one of the first predictive model and the second predictive model comprises a generalized linear model, a generalized boosted model, or a regression model, wherein the second predictive model generates the second plurality of scored results based on the first plurality of scored results generated by the first predictive model, and wherein execution, in parallel, of the insurance scoring script comprises:
simultaneously loading, by a first node among the plurality of nodes and a second node among the plurality of nodes, the first predictive model;
simultaneously defining, by the first node and the second node, first model parameters for the first predictive model; and
simultaneously generating, by the first node, a calculated score for each customer of a first portion of the plurality of data portions and generating, by the second node, a calculated score for each customer of a second portion of the plurality of data portions;
compile and write the first plurality of scored results into the first additional column of the results table and the second plurality of scored results into the second additional column of the results table; and
output the results table, comprising the first additional column and the second additional column, to one or more computing devices.
Dependent claims: 15, 16, 17, 18, 19
Specification