METHOD FOR PREVENTING THE EXTRACTION OF A MACHINE LEARNING MODEL
Abstract
A method and data processing system for detecting tampering of a machine learning model is provided. The method includes training a machine learning model. During a training operating period, a plurality of input values is provided to the machine learning model. The machine learning model is trained to provide a predetermined expected output value in response to a predetermined invalid input value. The model is verified as not having been tampered with by inputting the predetermined invalid input value during an inference operating period. If the expected output value is provided by the machine learning model in response to the predetermined invalid input value, then the machine learning model has not been tampered with. If the expected output value is not provided, then the machine learning model has been tampered with. The method may be implemented using the data processing system.
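The verification step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the 1-nearest-neighbour "model", the toy data, and the names `INVALID_INPUT`, `EXPECTED_OUTPUT`, `train`, and `is_untampered` are all assumptions chosen for illustration. The idea shown is only that a model trained on a predetermined invalid input returns the predetermined output for it, while a substituted model that never saw that input does not.

```python
from math import dist

# Hypothetical toy training data: 2-D points in [0, 1] labelled "low" or "high".
NORMAL_DATA = [
    ((0.1, 0.2), "low"),
    ((0.3, 0.1), "low"),
    ((0.8, 0.9), "high"),
    ((0.9, 0.7), "high"),
]

# Assumed predetermined invalid input (outside the valid range) and the
# output the model is trained to give for it. "low" is deliberately the
# label a model trained only on NORMAL_DATA would NOT give for this point.
INVALID_INPUT = (5.0, 5.0)
EXPECTED_OUTPUT = "low"


def train(samples):
    """Return a 1-nearest-neighbour predictor over the given samples."""
    stored = list(samples)

    def predict(x):
        # Label of the stored sample closest to x.
        return min(stored, key=lambda s: dist(s[0], x))[1]

    return predict


def is_untampered(model):
    """Feed the invalid input and compare against the expected output."""
    return model(INVALID_INPUT) == EXPECTED_OUTPUT


# Model trained with the invalid input included in its training data.
original = train(NORMAL_DATA + [(INVALID_INPUT, EXPECTED_OUTPUT)])
# Stand-in for a tampered or substituted model that never saw that input.
replaced = train(NORMAL_DATA)

print(is_untampered(original))
print(is_untampered(replaced))
```

In this sketch the check is a single query, so it can be repeated at any point during the inference operating period without access to the model's internals.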
10 Citations
20 Claims
1. A method comprising:
during a training phase of operation, training a machine learning model using first training data having a first classification;
during the training phase of operation, training the machine learning model using second training data having a second classification, the second classification being different than the first classification;
determining, during an inference phase of operation of the machine learning model, if an input sample belongs to the first classification or to the second classification;
if the input sample is determined to belong to the second classification, the machine learning model outputting a notification; and
if the input sample is determined to belong to the first classification, the machine learning model predicting a property of the input sample.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method for protecting a machine learning model from extraction, the method comprising:
during a training phase of operation, training the machine learning model using normal training data, the normal training data for training the machine learning model to perform a predetermined task;
during the training phase of operation, training the machine learning model using abnormal training data, the abnormal training data for training the machine learning model to identify an attempted extraction of the machine learning model;
determining, during an inference phase of operation of the machine learning model, if an input sample is input to extract the machine learning model or if the input sample is input for performance of the predetermined task;
if the input sample is determined by the model to be the attempted extraction, the machine learning model outputting a notification, and
if the input sample is determined by the model to be related to the predetermined task, the machine learning model predicting a property of the input sample.
Dependent claims: 10, 11, 12, 13, 14
15. A method comprising:
providing a machine learning system having a first machine learning model and a second machine learning model,
during a training phase of operation of the machine learning system, training the first machine learning model using normal training data, the normal training data for training the machine learning model to perform a predetermined task;
during the training phase of operation, training the second machine learning model using abnormal training data, the abnormal training data for training the machine learning model to identify an attempted extraction of the machine learning model;
determining, during an inference phase of operation of the machine learning system, if an input sample inputted to the second machine learning model is inputted to extract the first machine learning model or if the input sample is inputted for performance of the predetermined task;
if the input sample is determined by the model to be inputted for the attempted extraction, the machine learning model outputting a notification, and
if the input sample is determined by the model to be related to the predetermined task, the machine learning model predicting a property of the input sample.
Dependent claims: 16, 17, 18, 19, 20
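The flow recited in the independent claims can be sketched as follows. This is one possible reading, not the patent's implementation: the nearest-centroid classifier, the toy feature vectors, the label names, and the functions `train`, `nearest`, `infer`, and `infer_two_model` are all hypothetical. `infer` shows the single-model variant (claims 1 and 9), where the abnormal data is simply an extra class. `infer_two_model` shows the two-model variant (claim 15), where a separate detector screens each sample before the task model predicts.

```python
from collections import defaultdict
from math import dist

ABNORMAL = "abnormal"  # assumed label for the second (abnormal) classification

# Hypothetical normal training data for the predetermined task.
NORMAL = [((0.1, 0.1), "small"), ((0.2, 0.1), "small"),
          ((0.9, 0.8), "large"), ((0.8, 0.9), "large")]
# Hypothetical abnormal training data resembling extraction queries.
ABNORMAL_DATA = [((5.0, 5.0), ABNORMAL), ((6.0, 4.0), ABNORMAL),
                 ((4.0, 6.0), ABNORMAL)]


def train(samples):
    """Fit a nearest-centroid classifier: one mean vector per label."""
    groups = defaultdict(list)
    for features, label in samples:
        groups[label].append(features)
    return {label: tuple(sum(col) / len(pts) for col in zip(*pts))
            for label, pts in groups.items()}


def nearest(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: dist(centroids[label], x))


def infer(model, x):
    """Single-model variant: the abnormal class triggers a notification."""
    label = nearest(model, x)
    if label == ABNORMAL:
        return ("notification", "possible extraction attempt")
    return ("prediction", label)


def infer_two_model(detector, task_model, x):
    """Two-model variant: a detector screens x before the task model runs."""
    if nearest(detector, x) == ABNORMAL:
        return ("notification", "possible extraction attempt")
    return ("prediction", nearest(task_model, x))


# Single model trained on both normal and abnormal data.
model = train(NORMAL + ABNORMAL_DATA)

# Two-model system: the task model sees only normal data; the detector
# sees normal data relabelled as one class plus the abnormal data.
task_model = train(NORMAL)
detector = train([(f, "normal") for f, _ in NORMAL] + ABNORMAL_DATA)

print(infer(model, (0.15, 0.1)))
print(infer(model, (4.5, 5.2)))
print(infer_two_model(detector, task_model, (0.85, 0.9)))
```

In both variants the caller receives either a prediction or a notification, never both, which mirrors the two conditional branches at the end of each claim. How the notification is then handled is outside the scope of this sketch.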
Specification