Labeling data samples using objective questions
First Claim
1. In a computing environment, a method comprising:
- receiving a plurality of sets of answers from human judges to a set of objective questions regarding a data sample;
- determining, by one or more processors, a label from each human judge for the data sample based upon a set of answers from each human judge using a label assignment algorithm to provide a set of labels for the data sample, wherein the label assignment algorithm is produced by mapping the set of objective questions to a set of guidelines for labeling;
- determining a single label for the data sample using the set of labels; and
- associating the single label with the data sample.
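The flow recited in this claim can be illustrated with a minimal Python sketch. The question names, the labels, and the rules in `assign_label` are hypothetical, and majority voting is an assumption: the claim says only that a single label is determined from the set of per-judge labels, not how.

```python
from collections import Counter

def assign_label(answers: dict[str, bool]) -> str:
    """Map one judge's yes/no answers to a label (hypothetical guideline rules)."""
    if not answers["mentions_product"]:
        return "non-commercial"
    return "purchase-intent" if answers["intends_to_buy"] else "informational"

def single_label(answer_sets: list[dict[str, bool]]) -> str:
    """Derive one label per judge, then reduce the set of labels to one."""
    labels = [assign_label(answers) for answers in answer_sets]
    # Assumption: majority vote. Ties fall to the label encountered first.
    return Counter(labels).most_common(1)[0][0]

# Three judges answer the same questions about one data sample.
answer_sets = [
    {"mentions_product": True, "intends_to_buy": True},
    {"mentions_product": True, "intends_to_buy": True},
    {"mentions_product": True, "intends_to_buy": False},
]

# Associate the single label with the data sample.
sample_labels = {"sample-1": single_label(answer_sets)}
```

Keeping `assign_label` separate from the voting step mirrors the claim's distinction between the per-judge label and the final single label.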
2 Assignments
0 Petitions
Abstract
Described is a technology for obtaining labeled sample data. Labeling guidelines are converted into binary yes/no questions regarding data samples. The questions and data samples are provided to judges who then answer the questions for each sample. The answers are input to a label assignment algorithm that associates a label with each sample based upon the answers. If the guidelines are modified and previous answers to the binary questions are maintained, at least some of the previous answers may be used in re-labeling the samples in view of the modification.
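The re-labeling idea in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the question names, labels, and both versions of the guideline rules are invented for the example. The point it shows is that when the stored answers are kept, a guideline change requires only a new assignment function, not a new round of judging.

```python
# Answers retained per (sample, judge); question names are hypothetical.
stored_answers = {
    ("s1", "judge_a"): {"is_adult": False, "is_spam": True},
    ("s2", "judge_a"): {"is_adult": False, "is_spam": False},
}

def label_v1(answers):
    """Original (hypothetical) guideline: adult or spam content is 'bad'."""
    return "bad" if answers["is_adult"] or answers["is_spam"] else "good"

def label_v2(answers):
    """Revised (hypothetical) guideline: spam now receives its own label."""
    if answers["is_adult"]:
        return "bad"
    return "spam" if answers["is_spam"] else "good"

def relabel(answers_by_key, assign):
    """Apply a label assignment function to every stored answer set."""
    return {key: assign(answers) for key, answers in answers_by_key.items()}

labels_before = relabel(stored_answers, label_v1)
labels_after = relabel(stored_answers, label_v2)  # reuses the same answers
```

A modification that introduces a question nobody has answered yet would still need new judgments for that question, which is consistent with the abstract's wording that "at least some" previous answers may be reused.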
84 Citations
20 Claims
1. In a computing environment, a method comprising:
- receiving a plurality of sets of answers from human judges to a set of objective questions regarding a data sample;
- determining, by one or more processors, a label from each human judge for the data sample based upon a set of answers from each human judge using a label assignment algorithm to provide a set of labels for the data sample, wherein the label assignment algorithm is produced by mapping the set of objective questions to a set of guidelines for labeling;
- determining a single label for the data sample using the set of labels; and
- associating the single label with the data sample.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. In a computing environment, a system comprising:
- a memory;
- one or more processors coupled to the memory; and
- a label assignment mechanism, implemented on the one or more processors, and configured to assign a label from among a set of labels to a data sample, the label assignment mechanism configured to determine which label to assign to the data sample based upon a path of objective questions and a plurality of sets of answers to the objective questions posed to at least two human judges, wherein for each set of answers the label assignment mechanism is configured to traverse a tree-like structure of the objective questions in a depth first fashion and associate each path ending in a leaf node to one label from among the set of labels to generate a plurality of labels associated with the plurality of sets of answers, and wherein the determining includes identifying a single label for the data sample using the plurality of labels associated with the plurality of sets of answers.
Dependent claims: 12, 13, 14, 15, 16
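One way to picture the tree-like structure in this claim is the Python sketch below. The `Node` class, the question names, and the labels are hypothetical, and the majority vote at the end is an assumption, since the claim does not state how the single label is identified. `paths_to_labels` walks the tree depth first and associates each root-to-leaf path with one label; `label_for` follows a single judge's answers down to a leaf.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    question: Optional[str] = None  # None marks a leaf
    label: Optional[str] = None     # set only on leaves
    yes: Optional["Node"] = None
    no: Optional["Node"] = None

def paths_to_labels(node, path=()):
    """Depth-first walk yielding (path, label) for every leaf node."""
    if node.question is None:
        yield path, node.label
        return
    yield from paths_to_labels(node.yes, path + ((node.question, True),))
    yield from paths_to_labels(node.no, path + ((node.question, False),))

def label_for(tree, answers):
    """Follow one judge's answers from the root down to a leaf label."""
    node = tree
    while node.question is not None:
        node = node.yes if answers[node.question] else node.no
    return node.label

# Hypothetical two-question tree with three leaves.
tree = Node(
    "is_english",
    yes=Node("is_commercial", yes=Node(label="commercial"), no=Node(label="other")),
    no=Node(label="foreign"),
)
path_table = dict(paths_to_labels(tree))

# Two judges agree, one differs; majority vote (an assumption) picks one label.
judge_answers = [
    {"is_english": True, "is_commercial": True},
    {"is_english": True, "is_commercial": True},
    {"is_english": True, "is_commercial": False},
]
labels = [label_for(tree, answers) for answers in judge_answers]
final_label = Counter(labels).most_common(1)[0][0]
```

Because a judge who answers "no" at the root never reaches the second question, a tree of this shape also limits how many questions each judge has to answer.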
17. One or more computer storage devices having computer-executable instructions, which in response to execution by a computer, cause the computer to perform steps comprising:
- providing samples and a set of binary questions to a plurality of judges;
- obtaining sets of answers to the set of binary questions from each of the plurality of judges with respect to each sample;
- identifying bad judges through inconsistent answers across the sets of answers; and
- using the sets of answers obtained for each sample and a label assignment algorithm to determine which label of a finite set of labels to associate with that sample for each set of answers in the sets of answers to produce a plurality of labels for that sample, wherein the label assignment algorithm is produced by mapping the set of binary questions to a set of guidelines for labeling; and
- determining a single label for that sample using the plurality of labels.
Dependent claims: 18, 19, 20
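The claim does not define what counts as an "inconsistent" answer, so the sketch below adopts one possible interpretation as an explicit assumption: a judge is flagged when the share of their answers that disagree with the per-question majority exceeds a threshold. The function names, the data layout, and the 0.5 threshold are all hypothetical.

```python
from collections import Counter, defaultdict

def disagreement_rates(answers):
    """answers: {(judge, sample): {question: bool}} -> {judge: rate}."""
    # Tally the votes for each (sample, question) pair across all judges.
    votes = defaultdict(Counter)
    for (judge, sample), answer_set in answers.items():
        for question, answer in answer_set.items():
            votes[(sample, question)][answer] += 1
    majority = {key: tally.most_common(1)[0][0] for key, tally in votes.items()}

    # Count how often each judge departs from the majority answer.
    total, differing = Counter(), Counter()
    for (judge, sample), answer_set in answers.items():
        for question, answer in answer_set.items():
            total[judge] += 1
            differing[judge] += int(answer != majority[(sample, question)])
    return {judge: differing[judge] / total[judge] for judge in total}

def bad_judges(answers, threshold=0.5):
    """Flag judges whose disagreement rate exceeds the (assumed) threshold."""
    return {j for j, rate in disagreement_rates(answers).items() if rate > threshold}

answers = {
    ("a", "s1"): {"q1": True, "q2": False},
    ("b", "s1"): {"q1": True, "q2": False},
    ("c", "s1"): {"q1": False, "q2": True},
    ("a", "s2"): {"q1": False, "q2": False},
    ("b", "s2"): {"q1": False, "q2": False},
    ("c", "s2"): {"q1": True, "q2": False},
}
flagged = bad_judges(answers)
```

Other readings of "inconsistent" are equally plausible, for example a judge contradicting their own earlier answers on a repeated sample; the comparison step would change while the overall structure stays the same.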
Specification