
System for automatically annotating training data for a natural language understanding system

  • US 7,548,847 B2
  • Filed: 05/10/2002
  • Issued: 06/16/2009
  • Est. Priority Date: 05/10/2002
  • Status: Expired due to Fees
First Claim

1. A method of generating annotated training data to train a natural language understanding (NLU) system having one or more models, comprising:

    generating a proposed annotation with the NLU system for each of one or more units of unannotated training data;

    displaying the proposed annotations for user verification or correction to obtain a user-confirmed annotation; and

    training the NLU system with the user-confirmed annotation; and

    displaying an indication of a volume of training data used to train a plurality of different portions of the one or more models of the natural language understanding system;

    wherein displaying the proposed annotations for user verification or correction comprises:

    receiving a user input indicative of a user-identified portion of the proposed annotation; and

    displaying a plurality of alternative proposed annotations for the user-identified portion;

    wherein the one or more models impose model constraints and wherein displaying the one or more alternative proposed annotations comprises displaying an alternative proposed annotation for the user-identified portion of data only if the alternative proposed annotation can lead to an overall annotation for the unit that is consistent with the model constraints;

    wherein the proposed annotation includes parent and child nodes and wherein displaying a plurality of alternative proposed annotations includes displaying a user actuable delete node input which, when actuated, deletes a child node, and a user actuable add node input which, when actuated, adds a child node, and displaying the plurality of alternative proposed annotations in response to a user deleting a child node associated with the user-identified portion of data;

    wherein displaying a plurality of alternative proposed annotations comprises displaying a portion of the unit of data not covered by the proposed annotation, and displaying a plurality of alternative proposed annotations for the portion of data not covered by the proposed annotation;

    wherein the user is enabled to select a segment of the portion of data not covered by the proposed annotation and wherein displaying alternative proposed annotations comprises displaying a plurality of one or more alternative proposed annotations for the user-selected segment; and

    wherein the user is enabled to select one of the alternative proposed annotations from among the plurality of alternative proposed annotations, and the user-selected alternative proposed annotation is incorporated into the annotated training data.
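
The steps recited in claim 1 can be pictured with a small sketch. This is an illustration only, not the patented implementation: the `ToyNLU` class, the `LABELS` set, the `scripted_user` helper, and the "each label at most once per unit" rule are all hypothetical stand-ins for the NLU system, its models, the user interface, and the model constraints. The sketch also assumes each word appears at most once in a unit, and it uses `None` to mark text not covered by the proposed annotation.

```python
from collections import Counter, defaultdict

# Hypothetical slot labels; the claim does not name any labels.
LABELS = {"Airline", "City", "Date"}


class ToyNLU:
    """Toy stand-in for an NLU system: memorizes word -> label counts."""

    def __init__(self):
        self.word_label_counts = defaultdict(Counter)
        # Volume of training data seen per label (one "portion" of the model).
        self.volume = Counter()

    def propose(self, unit):
        """Propose a label per word; None marks text the annotation does not cover."""
        annotation, used = {}, set()
        for word in unit.split():
            label = None
            ranked = self.word_label_counts.get(word, Counter()).most_common()
            for candidate, _ in ranked:
                # Assumed model constraint: a label is used at most once per unit.
                if candidate not in used:
                    label = candidate
                    break
            annotation[word] = label
            if label is not None:
                used.add(label)
        return annotation

    def alternatives(self, annotation, word):
        """Offer only labels that keep the overall annotation within the constraint."""
        used_elsewhere = {
            lab for w, lab in annotation.items() if w != word and lab is not None
        }
        return sorted(LABELS - used_elsewhere - {annotation[word]})

    def train(self, confirmed):
        """Update the model and the per-label training volume."""
        for word, label in confirmed.items():
            if label is not None:
                self.word_label_counts[word][label] += 1
                self.volume[label] += 1


def scripted_user(corrections):
    """Simulate a user who picks from the displayed alternatives."""

    def review(nlu, proposed):
        confirmed = dict(proposed)
        for word, label in corrections.items():
            if word in confirmed and label in nlu.alternatives(confirmed, word):
                confirmed[word] = label
        return confirmed

    return review


def annotate(nlu, units, review):
    """Propose, let the user confirm or correct, then train on the result."""
    training_data = []
    for unit in units:
        proposed = nlu.propose(unit)
        confirmed = review(nlu, proposed)
        nlu.train(confirmed)
        training_data.append((unit, confirmed))
    return training_data


if __name__ == "__main__":
    nlu = ToyNLU()
    user = scripted_user({"Boston": "City", "Monday": "Date"})
    annotate(nlu, ["fly Boston Monday"], user)
    print(dict(nlu.volume))  # training volume per label
    proposal = nlu.propose("fly Boston Tuesday")
    print(nlu.alternatives(proposal, "Tuesday"))  # "City" is excluded: already used
```

In this sketch the filtering in `alternatives` plays the role of the constraint-consistency limitation, and the `volume` counter plays the role of the displayed indication of training data volume; a real system would apply its own grammar or model constraints and present the choices in a graphical interface.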
