Systems and methods for intelligently curating machine learning training data and improving machine learning model performance
First Claim
1. A system for intelligent formation and acquisition of machine learning training data for implementing an artificially intelligent dialogue system, the system comprising:
one or more remote sources of machine learning training data;
one or more hardware computing servers implementing an artificially intelligent dialogue platform that:
constructs a corpora of machine learning test corpus that comprise a plurality of historical queries and/or historical commands sampled from one or more production logs of a deployed dialogue system;
configures one or more training data sourcing parameters to source a corpora of raw machine learning training data from one or more remote sources of machine learning training data;
transmits, via one or more communication networks, the one or more training data sourcing parameters to the one or more remote sources of machine learning training data and collects, via the one or more communication networks, the corpora of raw machine learning training data;
calculates, using the one or more hardware computing servers, one or more efficacy metrics of the corpora of raw machine learning training data, wherein calculating the one or more efficacy metrics includes calculating one or more of a coverage metric value and a diversity metric value of the corpora of raw machine learning training data;
identifies whether to train at least one machine learning classifier of the artificially intelligent dialogue system based on one or more of the coverage metric value and the diversity metric value of the corpora of raw machine learning training data;
uses the corpora of raw machine learning training data, as machine learning training input, to train the at least one machine learning classifier if the calculated coverage metric value of the corpora of machine learning training data satisfies a minimum coverage metric threshold; and
responsive to training the at least one machine learning classifier using the corpora of raw machine learning training data, deploys the at least one machine learning classifier into a live implementation of the artificially intelligent dialogue system.
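The metric calculation and the training gate recited above can be illustrated with a minimal sketch. The claim does not define how the coverage or diversity metric is computed, so both definitions here are assumptions made only for illustration: coverage is taken as the fraction of distinct test-corpus tokens that also occur in the raw training data, and diversity as the mean pairwise Jaccard distance between training utterances. The function names, the whitespace tokenizer, and the 0.8 threshold are likewise hypothetical.

```python
from itertools import combinations


def tokenize(text):
    """Lowercase whitespace tokenization (an illustrative simplification)."""
    return text.lower().split()


def coverage_metric(training_corpus, test_corpus):
    """Assumed definition: share of distinct test tokens seen in the training data."""
    train_vocab = {tok for utt in training_corpus for tok in tokenize(utt)}
    test_vocab = {tok for utt in test_corpus for tok in tokenize(utt)}
    if not test_vocab:
        return 0.0
    return len(test_vocab & train_vocab) / len(test_vocab)


def diversity_metric(training_corpus):
    """Assumed definition: mean pairwise Jaccard distance between utterances."""
    token_sets = [set(tokenize(utt)) for utt in training_corpus]
    pairs = list(combinations(token_sets, 2))
    if not pairs:
        return 0.0
    distances = []
    for a, b in pairs:
        union = a | b
        distances.append(1.0 - len(a & b) / len(union) if union else 0.0)
    return sum(distances) / len(distances)


def should_train(coverage, min_coverage=0.8):
    """Gate training on a minimum coverage threshold (0.8 is a hypothetical value)."""
    return coverage >= min_coverage


# Toy data standing in for a test corpus and a sourced raw training corpus.
test_corpus = ["what is my balance", "transfer money to savings"]
raw_training = [
    "what is my account balance",
    "show my balance",
    "transfer money to checking",
]

cov = coverage_metric(raw_training, test_corpus)
div = diversity_metric(raw_training)
train_now = should_train(cov)
```

In this toy run, seven of the eight distinct test tokens appear in the training data, so the coverage is 0.875 and the hypothetical 0.8 threshold is satisfied. A production system would likely use a richer notion of coverage than token overlap.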
Abstract
Systems and methods for intelligent formation and acquisition of machine learning training data for implementing an artificially intelligent dialogue system include constructing a corpora of machine learning test corpus that comprise a plurality of historical queries and commands sampled from production logs of a deployed dialogue system; configuring training data sourcing parameters to source a corpora of raw machine learning training data from remote sources of machine learning training data; calculating efficacy metrics of the corpora of raw machine learning training data, wherein calculating the efficacy metrics includes calculating one or more of a coverage metric value and a diversity metric value of the corpora of raw machine learning training data; and using the corpora of raw machine learning training data to train at least one machine learning classifier if the calculated coverage metric value of the corpora of machine learning training data satisfies a minimum coverage metric threshold.
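The first step of the abstract, sampling historical queries and commands from production logs, might be sketched as follows. The log record layout (a dictionary with an "utterance" field), the deduplication step, and the function name are assumptions made for illustration; the abstract does not specify a log format or a sampling strategy.

```python
import random


def build_test_corpus(production_logs, sample_size, seed=0):
    """Sample distinct, non-empty utterances from log records.

    Each record is assumed to be a dict that may carry an "utterance" field;
    records without one (for example, system events) are skipped.
    """
    cleaned = {rec.get("utterance", "").strip() for rec in production_logs}
    cleaned.discard("")
    unique = sorted(cleaned)  # sort so the sample is reproducible for a given seed
    rng = random.Random(seed)
    k = min(sample_size, len(unique))
    return rng.sample(unique, k)


# Hypothetical log records from a deployed dialogue system.
logs = [
    {"utterance": "what is my balance"},
    {"utterance": " what is my balance "},
    {"utterance": ""},
    {"utterance": "transfer money"},
    {"utterance": "pay my bill"},
    {"event": "heartbeat"},
]

test_sample = build_test_corpus(logs, sample_size=2)
```

Uniform random sampling is only one option; sampling weighted by query frequency would keep the test corpus closer to the production distribution.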
19 Claims
Claims 2 to 18 depend from claim 1.
19. A method of intelligent formation and acquisition of machine learning training data for implementing an artificially intelligent dialogue system, the method comprising:
an artificially intelligent dialogue platform implemented by one or more hardware computing servers;
constructing a corpora of machine learning test corpus that comprise a plurality of historical queries and/or historical commands sampled from one or more production logs of a deployed dialogue system;
configuring one or more training data sourcing parameters to source a corpora of raw machine learning training data from one or more remote sources of machine learning training data;
transmitting the one or more training data sourcing parameters to the one or more remote sources of machine learning training data and collecting the corpora of raw machine learning training data;
calculating one or more efficacy metrics of the corpora of raw machine learning training data, wherein calculating the one or more efficacy metrics includes calculating one or more of a coverage metric value and a diversity metric value of the corpora of raw machine learning training data;
identifying whether to train at least one machine learning classifier of the artificially intelligent dialogue system based on one or more of the coverage metric value and the diversity metric value of the corpora of raw machine learning training data;
using the corpora of raw machine learning training data to train at least one machine learning classifier if the calculated coverage metric value of the corpora of machine learning training data satisfies a minimum coverage metric threshold; and
once the at least one machine learning classifier is trained using the corpora of raw machine learning training data, deploying the at least one machine learning classifier into an online implementation of the artificially intelligent dialogue system.
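The configuring, transmitting, and collecting steps of the method can be sketched in a few lines. The claims do not enumerate the training data sourcing parameters, so every field name below (intent, prompt, num_samples) is hypothetical, as are the JSON payload and the function names. The remote source is replaced by a local stub so the sketch runs without a network.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SourcingParameters:
    # All field names are hypothetical; the claims do not list the parameters.
    intent: str
    prompt: str
    num_samples: int


def serialize_parameters(params):
    """Encode the parameters as JSON, a stand-in for the transmitted payload."""
    return json.dumps(asdict(params), sort_keys=True)


def collect_raw_training_data(params, remote_sources):
    """Send the same parameters to each source and pool the returned utterances."""
    payload = serialize_parameters(params)
    corpus = []
    for source in remote_sources:
        corpus.extend(source(payload))
    return corpus


def fake_remote_source(payload):
    """Stub source returning canned utterances; a real source would be a network service."""
    request = json.loads(payload)
    canned = ["how much money do I have", "check my balance please", "balance"]
    return canned[: request["num_samples"]]


params = SourcingParameters(
    intent="check_balance",
    prompt="Ask a bank assistant for your balance.",
    num_samples=2,
)
raw_corpus = collect_raw_training_data(params, [fake_remote_source, fake_remote_source])
```

Modeling each remote source as a callable keeps the collection logic independent of the transport, so a function that issues a real network request could replace the stub without changing the rest of the sketch.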
Specification