×

Systems and methods for intelligently curating machine learning training data and improving machine learning model performance

  • US 10,303,978 B1
  • Filed: 09/27/2018
  • Issued: 05/28/2019
  • Est. Priority Date: 03/26/2018
  • Status: Active Grant
First Claim
Patent Images

1. A system for intelligent formation and acquisition of machine learning training data for implementing an artificially intelligent dialogue system, the system comprising:

  • one or more remote sources of machine learning training data;

    one or more hardware computing servers implementing an artificially intelligent dialogue platform that;

    constructs a corpora of machine learning test corpus that comprise a plurality of historical queries and/or historical commands test sampled from one or more production logs of a deployed dialogue system;

    configures one or more training data sourcing parameters to source a corpora of raw machine learning training data from one or more remote sources of machine learning training data;

    transmits, via one or more communication networks, the one or more training data sourcing parameters to the one or more remote sources of machine learning training data and collects, via the one or more communication networks, the corpora of raw machine learning training data;

    calculates, using the one or more hardware computing servers, one or more efficacy metrics of the corpora of raw machine learning training data, wherein calculating the one or more efficacy metrics includes calculating one or more of a coverage metric value and a diversity metric value of the corpora of raw machine learning training data;

    identifies whether to train at least one machine learning classifier of the artificially intelligent dialogue system based on one or more the coverage metric value and the diversity metric value of the corpora of raw machine learning;

    uses the corpora of raw machine learning training data, as machine learning training input, to train the at least one machine learning classifier if the calculated coverage metric value of the corpora of machine learning training data satisfies a minimum coverage metric threshold; and

    responsive to training the at least one machine learning classifier using the corpora of raw machine learning training data, deploys the at least one machine learning classifier into a live implementation of the artificially intelligent dialogue system.

View all claims
  • 1 Assignment
Timeline View
Assignment View
    ×
    ×