Method of generating a training object for training a machine learning algorithm

  • US 10,445,379 B2
  • Filed: 05/29/2017
  • Issued: 10/15/2019
  • Est. Priority Date: 06/20/2016
  • Status: Active Grant
First Claim

1. A computer implemented method of generating a training object for training a machine learning algorithm, the training object including a digital training document and an assigned label, the method executable at a training server, the method comprising:

  • acquiring the digital training document to be used in the training;

    transmitting, via a communication network, the digital training document to a plurality of assessors, transmitting further including indicating a range of possible labels for the assessors to assess from, the range of possible labels including at least a first possible label and a second possible label;

    obtaining from each of the plurality of assessors a selected label to form a pool of selected labels;

    generating a consensus label distribution based on the pool of selected labels, the consensus label distribution representing a range of perceived labels for the digital training document and an associated probability score for each of the perceived labels;

    the consensus label distribution being generated by aggregating an assessor-specific perceived label distribution for each assessor of the plurality of assessors, wherein:

    the assessor-specific perceived label distribution for a given assessor of the plurality of assessors is determined by:

    determining, for each of the range of possible labels, an assessor-inherent probability score, the assessor-inherent probability score for a given one of the range of possible labels being indicative of the probability of the given one of the range of possible labels being selected by the given assessor;

    determining, for each of the range of possible labels, a conditional probability score, the conditional probability score for a given one of the range of possible labels being indicative of the probability of the given one of the range of possible labels being perceived as a most relevant label to the digital training document by the given assessor despite the given assessor having selected a different one of the range of possible labels; and

    obtaining the assessor-specific perceived label distribution by aggregating, for each of the range of possible labels, the assessor-inherent probability score and the conditional probability score for the given assessor;

    training the machine learning algorithm using the digital training document and the consensus label distribution.
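
The aggregation steps of the claim can be illustrated with a minimal sketch. The claim does not specify the aggregation functions, so everything here is an assumption made for illustration: the per-assessor scores are combined by the law of total probability, the consensus is a plain arithmetic mean over assessors, and the conditional score table also includes the case where the perceived label equals the selected one. The function names, label names, and numeric scores are hypothetical, and how the scores are estimated from the pool of selected labels is outside this sketch.

```python
from typing import Dict, List

Label = str
Distribution = Dict[Label, float]


def perceived_distribution(
    inherent: Distribution,
    conditional: Dict[Label, Distribution],
) -> Distribution:
    """Assessor-specific perceived label distribution (assumed form).

    inherent[s]       : probability that this assessor selects label s.
    conditional[s][z] : probability that the assessor perceives z as the
                        most relevant label, given that they selected s.
    Assumed aggregation: q(z) = sum over s of inherent[s] * conditional[s][z].
    """
    labels = list(inherent)
    return {
        z: sum(inherent[s] * conditional[s][z] for s in labels)
        for z in labels
    }


def consensus_distribution(per_assessor: List[Distribution]) -> Distribution:
    """Consensus label distribution as the mean over assessors (assumed)."""
    labels = list(per_assessor[0])
    n = len(per_assessor)
    return {z: sum(d[z] for d in per_assessor) / n for z in labels}


# Hypothetical scores for two assessors and two possible labels.
assessor_a = perceived_distribution(
    inherent={"relevant": 0.5, "irrelevant": 0.5},
    conditional={
        "relevant": {"relevant": 0.9, "irrelevant": 0.1},
        "irrelevant": {"relevant": 0.2, "irrelevant": 0.8},
    },
)
assessor_b = perceived_distribution(
    inherent={"relevant": 1.0, "irrelevant": 0.0},
    conditional={
        "relevant": {"relevant": 0.7, "irrelevant": 0.3},
        "irrelevant": {"relevant": 0.5, "irrelevant": 0.5},
    },
)

# One probability score per perceived label, summing to 1.
consensus = consensus_distribution([assessor_a, assessor_b])
```

Under these assumptions, `consensus` would serve as a soft target paired with the digital training document, in place of a single hard label.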
