Scoring of insurance data

US 10,672,078 B1
Filed: 05/19/2014
Issued: 06/02/2020
Est. Priority Date: 05/19/2014
Status: Active Grant

Abstract

Methods, apparatuses, and systems for applying models to score insurance data are disclosed. In one aspect, a system comprising a master node, a plurality of nodes in at least one cluster connected to the master node, and one or more computing devices connected to the master node is disclosed, where the master node is configured to distribute, using a HIVE module, an insurance scoring script and a predictive model to each of the plurality of nodes. The master node may call a function of the HIVE module to instruct each of the plurality of nodes to execute the insurance scoring script to generate scored results, wherein the scored results are written into a results table, and wherein the scored results comprise insurance scores for a plurality of customers.
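
As a rough illustration of how a master node might distribute a script and models and then invoke them through Hive, the sketch below builds HiveQL statements as strings. `ADD FILE` and `SELECT TRANSFORM ... USING ... AS` are standard HiveQL. The function name, table names, column names, and file paths are hypothetical and do not come from the patent.

```python
def build_hive_statements(script_path, model_paths, input_table, results_table,
                          input_columns, score_columns):
    """Return HiveQL statements that ship files to the cluster and run the script.

    All table, column, and path names are illustrative assumptions.
    """
    # ADD FILE registers each file as a session resource that Hive makes
    # available to the nodes executing the query.
    statements = [f"ADD FILE {path};" for path in [script_path, *model_paths]]

    script_name = script_path.rsplit("/", 1)[-1]
    in_cols = ", ".join(input_columns)
    out_cols = ", ".join([*input_columns, *score_columns])

    # TRANSFORM streams input rows to the script and reads scored rows back.
    statements.append(
        f"INSERT OVERWRITE TABLE {results_table} "
        f"SELECT TRANSFORM ({in_cols}) "
        f"USING 'python {script_name}' "
        f"AS ({out_cols}) "
        f"FROM {input_table};"
    )
    return statements
```

The output column list repeats the input columns and appends the score columns, mirroring a results table that includes the input table plus additional columns.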

19 Claims

1. A system comprising:
- a master node comprising a memory;
  
  a plurality of nodes in at least one cluster connected to the master node; and
  
  one or more computing devices connected to the master node, wherein the master node is configured to:
  
  create an input table in the memory;
  
  load, into the input table, insurance data associated with a plurality of customers;
  
  create a first file directory path to an insurance scoring script stored on at least one of the one or more computing devices, a second file directory path to a first predictive model stored on at least one of the one or more computing devices, and a third file directory path to a second predictive model stored on at least one of the one or more computing devices;
  
  create a results table in the memory, the results table including the input table, a first additional column, and a second additional column;
  
  assign, using a complementary group rating model and the insurance data, the plurality of customers to a respective tier among a plurality of tiers;
  
  call a function of a Hive module that:
  
  divides the insurance data into a plurality of data portions, wherein the insurance data is divided into the plurality of data portions based on a respective tier such that each of the plurality of data portions comprises a portion of the insurance data that is associated with one or more customers in a same tier, and wherein the master node delivers each of the plurality of data portions to a respective one of the plurality of nodes; and
  
  instructs each of the plurality of nodes to execute, in parallel, the insurance scoring script to sequentially apply the first predictive model and the second predictive model to a specific portion of the plurality of data portions to generate a first plurality of scored results and a second plurality of scored results, respectively, wherein the first predictive model and the second predictive model are different, wherein at least one of the first predictive model and the second predictive model comprises a generalized linear model, a generalized boosted model, or a regression model, wherein the second predictive model generates the second plurality of scored results based on the first plurality of scored results generated by the first predictive model, and wherein the first plurality of scored results comprises insurance premiums or insurance policy renewal rates;
  
  compile and write the first plurality of scored results into the first additional column of the results table and the second plurality of scored results into the second additional column of the results table; and
  
  output the results table, comprising the first additional column and the second additional column, to at least one of the one or more computing devices.
  - 2. The system of claim 1, wherein the master node is further configured to receive a comma separated values file comprising the insurance data and parse the insurance data in the comma separated values file prior to loading the insurance data into the input table, and wherein the insurance data comprises insurance factors and a plurality of policy numbers corresponding to the plurality of customers.
  - 3. The system of claim 2, wherein the insurance factors comprise at least one of gender, age, area, income, type of home, cost of home, vehicle type, or claim history for each of the plurality of customers.
  - 4. The system of claim 1, wherein the insurance premiums or the insurance policy renewal rates are for a particular tier of the plurality of customers.
  - 11. The system of claim 1, wherein the first predictive model generates an insurance premium for each of the plurality of customers, wherein the second predictive model generates a renewal rate for each of the plurality of customers.
  - 12. The system of claim 11, wherein the first predictive model comprises the generalized boosted model, and wherein the insurance scoring script defines a number of trees for the first predictive model.
  - 13. The system of claim 1, wherein the first plurality of scored results comprises insurance premiums and the second plurality of scored results comprises renewal rates.
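
The tier-based division and the chained models recited in claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `assign_tier`, `first_model`, and `second_model` are hypothetical stand-ins with made-up formulas, and the column names `policy_number` and `age` are assumed for the example.

```python
import csv
import io
from collections import defaultdict


def parse_csv(text):
    """Parse CSV text into a list of row dicts (a stand-in for the input table)."""
    return list(csv.DictReader(io.StringIO(text)))


def assign_tier(row):
    # Hypothetical stand-in for the complementary group rating model.
    return "tier_a" if int(row["age"]) >= 40 else "tier_b"


def split_by_tier(rows):
    """Group rows so that each data portion holds customers of a single tier."""
    portions = defaultdict(list)
    for row in rows:
        portions[assign_tier(row)].append(row)
    return dict(portions)


def first_model(row):
    # Hypothetical premium model: a base amount plus an age loading.
    return 500.0 + 10.0 * int(row["age"])


def second_model(row, premium):
    # Hypothetical renewal-rate model that consumes the first model's score.
    return round(max(0.0, 1.0 - premium / 2000.0), 3)


def score_portion(portion):
    """Apply both models in sequence and append two columns to each row."""
    scored = []
    for row in portion:
        premium = first_model(row)
        renewal = second_model(row, premium)
        scored.append({**row, "premium": premium, "renewal_rate": renewal})
    return scored
```

Each scored row keeps its input fields and gains two fields, which corresponds to the first and second additional columns of the results table.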

5. A method comprising:
- creating, by a computer processor of a master node, an input table in a memory of the master node;
  
  loading, into the input table, insurance data of a plurality of customers;
  
  creating a first file directory path to an insurance scoring script, a second file directory path to a first predictive model, and a third file directory path to a second predictive model;
  
  creating, by the computer processor of the master node, a results table in the memory, the results table including the input table, a first additional column, and a second additional column;
  
  calling a function of a Hive module that divides the insurance data into a plurality of data portions and instructs each of a plurality of nodes in at least one cluster to execute, in parallel, the insurance scoring script that sequentially applies the first predictive model and the second predictive model to a specific portion of the plurality of data portions to generate a first plurality of scored results and a second plurality of scored results, respectively, wherein the insurance scoring script applies the first predictive model to generate insurance premiums or insurance policy renewal rates as the first plurality of scored results, wherein the first predictive model and second predictive model are different, wherein at least one of the first predictive model and the second predictive model comprises a generalized linear model, a generalized boosted model, or a regression model, wherein the second predictive model generates the second plurality of scored results based on the first plurality of scored results generated by the first predictive model, and wherein execution, in parallel, of the insurance scoring script comprises:
  
  simultaneously loading, by a first node among the plurality of nodes and a second node among the plurality of nodes, the first predictive model;
  
  simultaneously defining, by the first node and the second node, first model parameters for the first predictive model; and
  
  simultaneously generating, by the first node, a calculated score for each customer of a first portion of the plurality of data portions and generating, by the second node, a calculated score for each customer of a second portion of the plurality of data portions;
  
  compiling and writing, by the computer processor of the master node, the first plurality of scored results into the first additional column of the results table and the second plurality of scored results into the second additional column of the results table; and
  
  outputting the results table, comprising the first additional column and the second additional column, to one or more computing devices.
  - 6. The method of claim 5, wherein the writing comprises inserting all calculated scores into the first additional column of the results table.
  - 7. The method of claim 5, wherein the insurance data is divided into the plurality of data portions according to a number of the plurality of nodes in the at least one cluster.
  - 8. The method of claim 7, wherein dividing the insurance data is further based on a total file size of the insurance data.
  - 9. The method of claim 5, wherein the function of the Hive module comprises a Hive Transform function.
  - 10. The method of claim 5, wherein the insurance premiums or the insurance policy renewal rates are for a particular tier of the plurality of customers.
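
Claim 9 names a Hive Transform function. With its default row format, Hive's TRANSFORM clause passes rows to a script as tab-separated lines on standard input and reads tab-separated lines back from standard output. A scoring script in that style might look like the sketch below. The two model functions and the two-field row layout (policy number, age) are assumptions made for illustration only.

```python
import sys


def premium_model(age):
    # Hypothetical first predictive model.
    return 500.0 + 10.0 * age


def renewal_model(premium):
    # Hypothetical second predictive model, fed by the first model's score.
    return max(0.0, 1.0 - premium / 2000.0)


def score_line(line):
    """Score one tab-separated row of the assumed form: policy_number, age."""
    policy_number, age = line.rstrip("\n").split("\t")
    premium = premium_model(int(age))
    renewal = renewal_model(premium)
    # Echo the input fields and append the two scores as extra columns.
    return f"{policy_number}\t{age}\t{premium:.2f}\t{renewal:.3f}"


def transform(lines):
    """Yield one scored output line per non-empty input line."""
    for line in lines:
        if line.strip():
            yield score_line(line)


def main():
    # Entry point a deployed script would call to stream rows from Hive.
    for out in transform(sys.stdin):
        sys.stdout.write(out + "\n")
```

A deployed script would call `main()` at module level. It is left uncalled here so the functions can be exercised without a Hive session.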

14. A system comprising:
- a master node comprising a memory;
  
  a plurality of nodes in at least one cluster connected to the master node; and
  
  one or more computing devices connected to the master node, wherein the master node is configured to:
  
  create an input table in the memory;
  
  load, into the input table, insurance data of a plurality of customers;
  
  create a first file directory path to an insurance scoring script, a second file directory path to a first predictive model, and a third file directory path to a second predictive model;
  
  create a results table in the memory, the results table including the input table, a first additional column, and a second additional column;
  
  call a function of a Hive module that divides the insurance data into a plurality of data portions and instructs each of the plurality of nodes to execute, in parallel, the insurance scoring script that sequentially applies the first predictive model and the second predictive model to a specific portion of the plurality of data portions to generate a first plurality of scored results and a second plurality of scored results, respectively, wherein the insurance scoring script applies the first predictive model to generate insurance premiums or insurance policy renewal rates as the first plurality of scored results, wherein the first predictive model and second predictive model are different, wherein at least one of the first predictive model and the second predictive model comprises a generalized linear model, a generalized boosted model, or a regression model, wherein the second predictive model generates the second plurality of scored results based on the first plurality of scored results generated by the first predictive model, and wherein execution, in parallel, of the insurance scoring script comprises:
  
  simultaneously loading, by a first node among the plurality of nodes and a second node among the plurality of nodes, the first predictive model;
  
  simultaneously defining, by the first node and the second node, first model parameters for the first predictive model; and
  
  simultaneously generating, by the first node, a calculated score for each customer of a first portion of the plurality of data portions and generating, by the second node, a calculated score for each customer of a second portion of the plurality of data portions;
  
  compile and write the first plurality of scored results into the first additional column of the results table and the second plurality of scored results into the second additional column of the results table; and
  
  output the results table, comprising the first additional column and the second additional column, to one or more computing devices.
  - 15. The system of claim 14, wherein writing the first plurality of scored results comprises inserting all calculated scores into the first additional column of the results table.
  - 16. The system of claim 14, wherein the insurance data is divided into the plurality of data portions according to a number of the plurality of nodes in the at least one cluster.
  - 17. The system of claim 16, wherein dividing the insurance data is further based on a total file size of the insurance data.
  - 18. The system of claim 14, wherein the function of the Hive module comprises a Hive Transform function.
  - 19. The system of claim 14, wherein the insurance premiums or the insurance policy renewal rates are for a particular tier of the plurality of customers.
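
The parallel steps in claim 14 (each node loading the model, defining its parameters, and scoring its own portion, followed by the master node compiling the scores) can be approximated on a single machine. In the sketch below, threads stand in for cluster nodes, the data is divided by node count as in claim 16, and `load_first_model` with its parameters is a hypothetical placeholder rather than anything specified in the patent.

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(rows, node_count):
    """Divide rows into node_count contiguous portions of near-equal size."""
    size, extra = divmod(len(rows), node_count)
    portions, start = [], 0
    for i in range(node_count):
        end = start + size + (1 if i < extra else 0)
        portions.append(rows[start:end])
        start = end
    return portions


def load_first_model():
    # Hypothetical loader; a real node would read a fitted model from a file path.
    return lambda row, params: params["base"] + params["per_year"] * row["age"]


def node_task(portion):
    """Work done by one node: load the model, define parameters, score rows."""
    model = load_first_model()
    params = {"base": 500.0, "per_year": 10.0}  # assumed model parameters
    return [model(row, params) for row in portion]


def score_in_parallel(rows, node_count):
    """Score all portions concurrently, then compile one results table."""
    portions = split_evenly(rows, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        # pool.map returns results in the same order as the portions.
        scored_portions = list(pool.map(node_task, portions))
    scores = [score for part in scored_portions for score in part]
    # Results table: every input row plus one additional score column.
    return [{**row, "premium": score} for row, score in zip(rows, scores)]
```

Because the portions are contiguous and `pool.map` preserves order, the flattened scores line up with the original rows when the results table is compiled.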

Current Assignee: Allstate Insurance Company (Allstate Corporation)
Original Assignee: Allstate Insurance Company (Allstate Corporation)
Inventors: Tagny Diesse, Patrick Christian
Primary Examiner(s): Jacob, William J
Application Number: US14/281,545
Time in Patent Office: 2,206 Days
Field of Search: None
CPC Class Codes: G06Q 40/08 Insurance
