System and method for incremental annotation of datasets
First Claim
1. A system for incremental annotation of datasets, the system comprising:
at least one storage device configured to store a plurality of labeled examples and a plurality of unlabeled examples; and
at least one processor configured to:
use the plurality of labeled examples to generate a first inference model;
use the first inference model to assign labels to at least part of the unlabeled examples;
calculate confidence levels corresponding to the assigned labels;
use the confidence levels to select a subset of the plurality of unlabeled examples, where at least one of the plurality of unlabeled examples is not included in the selected subset of the plurality of unlabeled examples;
generate a second inference model based on the plurality of labeled examples, the selected subset of the plurality of unlabeled examples, and the assigned labels corresponding to the selected subset of the plurality of unlabeled examples;
use the confidence levels to select a user of a plurality of alternative users;
provide a request to the selected user to assign labels;
in response to the request, receive from the user an assignment of labels to one or more of the unlabeled examples; and
generate a third inference model based on the plurality of labeled examples, the one or more of the unlabeled examples, and the assignment of labels received from the user.
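The first half of the claim (first model, assigned labels, confidence levels, selected subset, second model) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the 1-D nearest-centroid learner, the softmax-over-distance confidence, the fixed `threshold`, and the names `train`, `predict`, and `self_training_round`.

```python
from math import exp
from statistics import mean

def train(examples):
    """Fit a 1-D nearest-centroid model: one mean feature value per label."""
    by_label = {}
    for x, y in examples:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}

def predict(model, x):
    """Return (label, confidence); confidence is a softmax over negative distances."""
    scores = {y: exp(-abs(x - c)) for y, c in model.items()}
    label = max(scores, key=scores.get)
    return label, scores[label] / sum(scores.values())

def self_training_round(labeled, unlabeled, threshold=0.9):
    # First inference model: trained on the labeled examples only.
    first_model = train(labeled)
    # Assign a label and a confidence level to each unlabeled example.
    assigned = [(x, *predict(first_model, x)) for x in unlabeled]
    # Select the confident subset; at least one example may be left out.
    selected = [(x, y) for x, y, conf in assigned if conf >= threshold]
    rejected = [x for x, _, conf in assigned if conf < threshold]
    # Second inference model: labeled examples plus the selected pseudo-labels.
    second_model = train(labeled + selected)
    return first_model, second_model, selected, rejected

labeled = [(0.0, "neg"), (1.0, "neg"), (9.0, "pos"), (10.0, "pos")]
unlabeled = [0.5, 5.2, 9.5, 12.0]
first_model, second_model, selected, rejected = self_training_round(labeled, unlabeled)
print(selected)  # confident pseudo-labels used for the second model
print(rejected)  # examples left out of the selected subset
```

In this toy run the example at 5.2 sits between the two centroids, receives a confidence near 0.6, and is therefore excluded from the subset used to build the second model.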
Abstract
Systems and methods for incremental annotation of datasets are provided. For example, a group of labeled examples and a group of unlabeled examples may be obtained, a first inference model may be generated using the group of labeled examples, labels may be assigned to at least part of the group of unlabeled examples using the first inference model, confidence levels may be assigned to the assigned labels, a subset of the group of unlabeled examples may be selected using the confidence levels, and in some cases a second inference model may be generated using the selected subset and/or the corresponding assigned labels.
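The selection step described in the abstract can be written out formally under one possible reading. The notation is introduced here for illustration only: $U$ is the group of unlabeled examples, $f_1(y \mid x)$ is assumed to be a score the first inference model gives label $y$ for example $x$, $c(x)$ is the confidence level, and $\tau$ is an assumed threshold. The source does not fix a particular confidence measure or selection rule.

```latex
\hat{y}(x) = \arg\max_{y} f_1(y \mid x), \qquad
c(x) = \max_{y} f_1(y \mid x), \qquad
S = \{\, x \in U : c(x) \ge \tau \,\} \subsetneq U
```

The strict inclusion $S \subsetneq U$ mirrors the claim language that at least one unlabeled example is not included in the selected subset.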
18 Claims
1. A system for incremental annotation of datasets, as recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A method for incremental annotation of datasets, the method comprising:
accessing a plurality of labeled examples and a plurality of unlabeled examples;
using the plurality of labeled examples to generate a first inference model;
using the first inference model to assign labels to at least part of the unlabeled examples;
calculating confidence levels corresponding to the assigned labels;
using the confidence levels to select a subset of the plurality of unlabeled examples, where at least one of the plurality of unlabeled examples is not included in the selected subset of the plurality of unlabeled examples;
generating a second inference model based on the plurality of labeled examples, the selected subset of the plurality of unlabeled examples, and the assigned labels corresponding to the selected subset of the plurality of unlabeled examples;
using the confidence levels to select a user of a plurality of alternative users;
providing a request to the selected user to assign labels;
in response to the request, receiving from the user an assignment of labels to one or more of the unlabeled examples; and
generating a third inference model based on the plurality of labeled examples, the one or more of the unlabeled examples, and the assignment of labels received from the user.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16, 17
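The last steps of the method (selecting a user from the confidence levels, requesting labels, and generating a third model) might look like the sketch below. The claim does not say how confidence levels map to a user, so the routing policy is purely an assumption: batches with lower mean confidence go to the annotator with the lower `min_confidence` floor. The `users` records, `select_user`, `request_labels`, and the stand-in `train` learner are all hypothetical names, and the human round trip is simulated by a function.

```python
from statistics import mean

def train(examples):
    """Stand-in learner: 1-D nearest-centroid, one mean feature value per label."""
    groups = {}
    for x, y in examples:
        groups.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in groups.items()}

def select_user(confidences, users):
    """Assumed policy: pick the eligible user with the highest confidence floor,
    so that only hard (low-confidence) batches reach the expert."""
    batch_confidence = mean(confidences)
    eligible = [u for u in users if u["min_confidence"] <= batch_confidence]
    return max(eligible, key=lambda u: u["min_confidence"])

def request_labels(user, examples):
    """Placeholder for the request/response exchange with a human annotator."""
    return [(x, user["answer"](x)) for x in examples]

users = [
    {"name": "expert", "min_confidence": 0.0,
     "answer": lambda x: "pos" if x > 5 else "neg"},
    {"name": "generalist", "min_confidence": 0.7,
     "answer": lambda x: "pos" if x > 4 else "neg"},
]
labeled = [(0.0, "neg"), (1.0, "neg"), (9.0, "pos"), (10.0, "pos")]
# Examples an earlier model was unsure about: (feature, confidence level).
uncertain = [(5.2, 0.60), (4.6, 0.55)]

chosen = select_user([conf for _, conf in uncertain], users)
user_labeled = request_labels(chosen, [x for x, _ in uncertain])
# Third inference model: labeled examples plus the user-labeled examples.
third_model = train(labeled + user_labeled)
print(chosen["name"], user_labeled)
```

Here the mean confidence of the batch is below the generalist's floor, so the request goes to the expert, and the returned labels are merged with the original labeled examples before retraining.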
18. A non-transitory computer readable medium storing data and computer implementable instructions for carrying out a method for incremental annotation of datasets, the method comprising:
accessing a plurality of labeled examples and a plurality of unlabeled examples;
using the plurality of labeled examples to generate a first inference model;
using the first inference model to assign labels to at least part of the unlabeled examples;
calculating confidence levels corresponding to the assigned labels;
using the confidence levels to select a subset of the plurality of unlabeled examples, where at least one of the plurality of unlabeled examples is not included in the selected subset of the plurality of unlabeled examples;
generating a second inference model based on the plurality of labeled examples, the selected subset of the plurality of unlabeled examples, and the assigned labels corresponding to the selected subset of the plurality of unlabeled examples;
using the confidence levels to select a user of a plurality of alternative users;
providing a request to the selected user to assign labels;
in response to the request, receiving from the user an assignment of labels to one or more of the unlabeled examples; and
generating a third inference model based on the plurality of labeled examples, the one or more of the unlabeled examples, and the assignment of labels received from the user.
Specification