COMPUTER-AIDED NATURAL LANGUAGE ANNOTATION
Abstract
The present invention uses a natural language understanding system that is currently being trained to assist in annotating training data for training that natural language understanding system. Unannotated training data is provided to the system and the system proposes annotations to the training data. The user is offered an opportunity to confirm or correct the proposed annotations, and the system is trained with the corrected or verified annotations.
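The propose, review, and retrain cycle described in the abstract can be sketched as follows. This is only an illustrative sketch, not the patented implementation: `ToyNLU`, its word-vote model, the `review` callback, and the choice to retrain after every unit are all assumptions made for the example.

```python
from collections import Counter

class ToyNLU:
    """Hypothetical stand-in for the NLU system being trained.

    It remembers which labels each word has appeared with and
    proposes the label that receives the most word votes.
    """
    def __init__(self):
        self.counts = {}  # word -> Counter of labels seen with that word

    def train(self, text, label):
        for word in text.lower().split():
            self.counts.setdefault(word, Counter())[label] += 1

    def propose(self, text):
        votes = Counter()
        for word in text.lower().split():
            votes.update(self.counts.get(word, {}))
        return votes.most_common(1)[0][0] if votes else None

def annotate(model, units, review):
    """Propose an annotation per unit, let the user confirm or correct it,
    then train the same model on the resulting annotation."""
    annotated = []
    for text in units:
        proposed = model.propose(text)
        final = review(text, proposed)  # user verifies or corrects
        model.train(text, final)        # assumption: retrain after each unit
        annotated.append((text, final))
    return annotated
```

Here `review` stands in for the user: it returns the proposed annotation unchanged to confirm it, or a different annotation to correct it.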
14 Claims
1. A method, performed using one or more processors, of generating annotated training data for training a natural language understanding system, comprising:
generating, with the natural language understanding system running on one or more of the processors, a proposed annotation for each of a plurality of units of unannotated training data received via one or more input components;
calculating, with one or more of the processors, a confidence measure for each of the proposed annotations for a given unit of the training data;
displaying on an output component at least some of the proposed annotations in an order based on the confidence measures of the proposed annotations, with one or more input components providing one or more user-actuable inputs configured for verification of the proposed annotations and one or more user-actuable inputs configured for deletion of the proposed annotations;
responding, with one or more of the processors, to an input for verification of one of the proposed annotations by storing the verified annotation for the given unit of the training data; and
responding, with one or more of the processors, to an input for deletion of one of the proposed annotations by presenting on an output component at least some of the remaining proposed annotations in an order based on the confidence measures of the remaining proposed annotations.
Dependent claims: 2, 3, 4, 5.
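The confidence-ordered review in claim 1 can be illustrated with a minimal sketch. The names `Proposal` and `ReviewSession` are hypothetical, and the descending sort is an assumption: the claim requires only "an order based on the confidence measures" and does not fix a direction.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Proposal:
    annotation: str
    confidence: float  # confidence measure calculated for this proposal

@dataclass
class ReviewSession:
    unit: str                       # one unit of training data
    proposals: List[Proposal]       # proposed annotations for that unit
    verified: Dict[str, str] = field(default_factory=dict)

    def displayed(self) -> List[Proposal]:
        # Assumption: highest confidence first.
        return sorted(self.proposals, key=lambda p: p.confidence, reverse=True)

    def verify(self, proposal: Proposal) -> None:
        # Verification input: store the verified annotation for the unit.
        self.verified[self.unit] = proposal.annotation

    def delete(self, proposal: Proposal) -> List[Proposal]:
        # Deletion input: drop the proposal and re-present the remaining
        # ones, again ordered by confidence.
        self.proposals.remove(proposal)
        return self.displayed()
```

A user interface would call `verify` or `delete` from the corresponding user-actuable inputs; the sketch leaves the display layer out.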
6. A method, performed using one or more processors, of generating annotated training data for training a natural language understanding (NLU) system, comprising:
generating, with the NLU system running on one or more of the processors, a proposed annotation for each of a plurality of units of unannotated training data, each proposed annotation having a type;
displaying on an output component at least some of the proposed annotations in an order based on the type of each of the proposed annotations, with one or more input components providing one or more user-actuable inputs configured for verification of the proposed annotations and one or more user-actuable inputs configured for deletion of the proposed annotations;
responding, with one or more of the processors, to an input for verification of one of the proposed annotations by storing the verified annotation for the given unit of the training data; and
responding, with one or more of the processors, to an input for deletion of one of the proposed annotations by presenting at least some of the remaining proposed annotations in an order based on the type of each of the remaining proposed annotations.
Dependent claims: 7, 8, 9.
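Claim 6 orders the display by annotation type rather than by confidence. A minimal sketch, assuming a fixed priority list of types: `TYPE_ORDER`, the type names in it, and the helper functions are hypothetical and do not come from the patent text.

```python
from typing import List, NamedTuple

class Proposal(NamedTuple):
    unit: str        # the unit of training data being annotated
    annotation: str  # the proposed annotation
    kind: str        # the type of the proposed annotation

# Hypothetical display priority of annotation types.
TYPE_ORDER = ["ShowFlight", "BookHotel", "Unknown"]

def order_by_type(proposals: List[Proposal]) -> List[Proposal]:
    rank = {kind: i for i, kind in enumerate(TYPE_ORDER)}
    # Types missing from TYPE_ORDER sort last; sorted() is stable, so
    # proposals of the same type keep their original relative order.
    return sorted(proposals, key=lambda p: rank.get(p.kind, len(TYPE_ORDER)))

def delete_and_redisplay(proposals: List[Proposal], rejected: Proposal) -> List[Proposal]:
    # Deletion input: present the remaining proposals, ordered by type.
    remaining = [p for p in proposals if p != rejected]
    return order_by_type(remaining)
```

Grouping by type places similar proposals next to each other on screen, so the user reviews all proposals of one type before moving to the next.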
10. A method, performed using one or more processors, of generating annotated training data for training a natural language understanding (NLU) system employing a plurality of different natural language training techniques, comprising:
generating, with the natural language understanding system running on one or more of the one or more processors, a different proposed annotation with each of a plurality of the natural language training techniques for a unit of unannotated training data, to obtain a plurality of proposed annotations generated by the natural language training techniques used for that unit of unannotated training data;
displaying on an output component one or more of the proposed annotations, with user actuable inputs for user rejection or user selection of each of the displayed proposed annotations;
responding, with one or more of the one or more processors, to an input for user rejection of one of the proposed annotations by displaying on an output component one or more of any remaining proposed annotations; and
responding, with one or more of the processors, to an input for user selection of one of the proposed annotations by storing the selected annotation for the given unit of the training data.
Dependent claims: 11, 12, 13, 14.
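The multi-technique flow of claim 10 can be sketched in the same spirit. Everything named here is an assumption for illustration: the two toy "techniques" are trivial functions standing in for real ones, `decide` stands in for the user's selection or rejection input, and proposals are shown one at a time although the claim allows displaying several at once.

```python
from typing import Callable, Dict, List, Optional, Tuple

def keyword_technique(unit: str) -> str:
    # Hypothetical technique 1: keyword match.
    return "ShowFlight" if "flight" in unit.lower() else "Unknown"

def length_technique(unit: str) -> str:
    # Hypothetical technique 2: crude length heuristic.
    return "ShortQuery" if len(unit.split()) < 4 else "LongQuery"

def propose_all(unit: str,
                techniques: Dict[str, Callable[[str], str]]) -> List[Tuple[str, str]]:
    # One proposed annotation per technique for the same unit.
    return [(name, technique(unit)) for name, technique in techniques.items()]

def review(unit: str,
           proposals: List[Tuple[str, str]],
           decide: Callable[[str], str],
           store: Dict[str, str]) -> Optional[str]:
    remaining = list(proposals)
    while remaining:
        _name, annotation = remaining[0]
        if decide(annotation) == "select":
            store[unit] = annotation  # selection: store for this unit
            return annotation
        remaining.pop(0)              # rejection: show what remains
    return None                       # every proposal was rejected
```

Returning `None` when all proposals are rejected leaves the unit unannotated; how such a unit is then handled is outside what the claim text specifies.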
Specification