System and method for prediction using synthetic features and gradient boosted decision tree
Abstract
A machine learning system and method are disclosed in which a plurality of synthetic features are created from input data, and a gradient boosted decision tree algorithm is then executed by the computer to process both the synthetic features and at least some of the input data to produce an output that is a probability.
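The data flow summarized in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes scikit-learn and NumPy, uses randomly generated toy data, and every name in it (`history`, `amount`, `synthetic_features`, `gbdt_inputs`) is hypothetical. A small neural network scores the transaction history, a logistic regression scores the loan amount, and a gradient boosted decision tree model then processes both initial probabilities together with some of the raw data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n = 400

# Hypothetical toy data: six summary numbers stand in for a transaction
# history, and one column holds the requested loan amount.
history = rng.normal(size=(n, 6))
amount = rng.uniform(1_000, 50_000, size=(n, 1))
logit = history[:, 0] - 0.5 * history[:, 1] + (amount[:, 0] - 25_000) / 20_000
default = (logit + rng.normal(scale=0.5, size=n) > 0).astype(int)

# Assumption of this sketch: the base learners and the GBDT are trained on
# separate halves, so the GBDT sees out-of-sample initial probabilities.
base, top = slice(0, 200), slice(200, 400)

# First learner: a neural network over the transaction history.
nn = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
nn.fit(history[base], default[base])

# Second learner: a different algorithm over a different input (loan amount).
lr = LogisticRegression()
lr.fit(amount[base] / 10_000, default[base])


def synthetic_features(hist, amt):
    """Return one column per learner: its initial default probability."""
    p1 = nn.predict_proba(hist)[:, 1]
    p2 = lr.predict_proba(amt / 10_000)[:, 1]
    return np.column_stack([p1, p2])


def gbdt_inputs(hist, amt):
    """Combine the synthetic features with some of the original data."""
    return np.column_stack([synthetic_features(hist, amt), amt])


gbdt = GradientBoostingClassifier(n_estimators=50, random_state=0)
gbdt.fit(gbdt_inputs(history[top], amount[top]), default[top])

# Final probability of default for the first five applications.
final_probability = gbdt.predict_proba(gbdt_inputs(history[:5], amount[:5]))[:, 1]
print(final_probability)
```

Which raw columns are passed to the GBDT alongside the synthetic features is a design choice; here only the loan amount is passed, to keep the sketch short.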
15 Claims
1. A computer-implemented method comprising:
the computer obtaining a set of data relating to a loan application, wherein the data includes an amount of loan requested and a transaction history of a loan applicant;
the computer determining a plurality of synthetic features by at least:
  executing a plurality of machine learning algorithms that have been trained, each of the machine learning algorithms, when executed, receiving as an input at least some of the data and producing as an output a respective synthetic feature representing an initial probability of whether a loan default will occur, wherein at least two of the machine learning algorithms are different from each other and accept different inputs;
  wherein a first one of the machine learning algorithms is implemented using a neural network and accepts the transaction history of the loan applicant as its input and outputs a first synthetic feature representing a first initial probability of whether the loan default will occur, and wherein a second one of the machine learning algorithms accepts the amount of loan requested as its input and outputs a second synthetic feature representing a second initial probability of whether the loan default will occur;
the computer executing a gradient boosted decision tree (GBDT) algorithm, the GBDT algorithm processing both: (i) the synthetic features including the first synthetic feature and the second synthetic feature, and (ii) at least some of the data, and producing an output representing a final probability of whether the loan default will occur;
the computer generating an indication of whether or not to approve the loan based on whether a particular value is above or below a stored threshold, wherein the particular value is the final probability or is a function of the final probability.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system comprising:
a memory to store a set of data relating to a loan application, wherein the data includes an amount of loan requested and a transaction history of a loan applicant;
a predictor to receive the data and to produce an output representing a final probability of whether a loan default will occur;
the predictor including a plurality of learners that have been trained, each learner implementing a respective machine learning algorithm, and wherein at least two of the learners implement a machine learning algorithm different from one another that accepts inputs different from one another;
the predictor configured to:
  determine a plurality of synthetic features by sending to each of the learners at least some of the data, and each of the learners outputting a respective synthetic feature representing an initial probability of whether the loan default will occur, wherein a first learner of the learners implements a neural network and accepts the transaction history of the loan applicant and outputs a first synthetic feature representing a first initial probability of whether the loan default will occur, and wherein a second learner of the learners accepts the amount of loan requested as its input and outputs a second synthetic feature representing a second initial probability of whether the loan default will occur; and
  execute a gradient boosted decision tree (GBDT) algorithm, the GBDT algorithm processing both: (i) the synthetic features including the first synthetic feature and the second synthetic feature, and (ii) at least some of the data, and producing the output representing the final probability of whether the loan default will occur;
wherein the system is configured to generate an indication of whether or not to approve the loan based on whether a particular value is above or below a stored threshold, wherein the particular value is the final probability or is a function of the final probability.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A system comprising:
at least one processor; and
memory having stored thereon processor-executable instructions that, when executed, cause the at least one processor to:
  obtain a set of data relating to a loan application, wherein the data includes an amount of loan requested and a transaction history of a loan applicant;
  determine a plurality of synthetic features by at least:
    executing a plurality of machine learning algorithms that have been trained, each of the machine learning algorithms, when executed, receiving as an input at least some of the data and producing as an output a respective synthetic feature representing an initial probability of whether a loan default will occur, wherein at least two of the machine learning algorithms are different from each other and accept different inputs;
    wherein a first one of the machine learning algorithms is implemented using a neural network and accepts the transaction history of the loan applicant and outputs a first synthetic feature representing a first initial probability of whether the loan default will occur, and wherein a second one of the machine learning algorithms accepts the amount of loan requested as its input and outputs a second synthetic feature representing a second initial probability of whether the loan default will occur; and
  execute a gradient boosted decision tree (GBDT) algorithm, the GBDT algorithm processing both: (i) the synthetic features including the first synthetic feature and the second synthetic feature, and (ii) at least some of the data, and producing an output representing a final probability of whether the loan default will occur;
  generate an indication of whether or not to approve the loan based on whether a particular value is above or below a stored threshold, wherein the particular value is the final probability or is a function of the final probability.
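The final step recited in each independent claim, comparing a particular value against a stored threshold, might look like the sketch below. It is an illustration under stated assumptions only: the claims do not say which side of the threshold means approval or which function of the probability is used, so the function name `approve_loan`, the `approve_if_below` flag, and the log-odds transform are all hypothetical choices.

```python
import math
from typing import Callable, Optional


def log_odds(p: float) -> float:
    """One possible function of the final probability (hypothetical choice)."""
    return math.log(p / (1.0 - p))


def approve_loan(
    final_probability: float,
    threshold: float,
    transform: Optional[Callable[[float], float]] = None,
    approve_if_below: bool = True,
) -> bool:
    """Return True if the loan should be approved.

    The compared value is the final default probability itself, or a
    function of it when `transform` is given. By default (an assumption of
    this sketch) a value below the threshold means approval.
    """
    value = final_probability if transform is None else transform(final_probability)
    return value < threshold if approve_if_below else value > threshold


# Compare the probability directly against a stored threshold of 0.2.
print(approve_loan(0.1, threshold=0.2))
# Compare the log-odds of default against a stored threshold of 0.0.
print(approve_loan(0.1, threshold=0.0, transform=log_odds))
```

Keeping the transform and the comparison direction as parameters lets the same check cover both variants named in the claims: thresholding the probability itself or thresholding a function of it.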
Specification