Method for automatic wrapper repair

  • US 20060085468A1
  • Filed: 12/05/2005
  • Published: 04/20/2006
  • Est. Priority Date: 07/18/2002
  • Status: Active Grant
First Claim

1. A method of information extraction from a Web page using an initial wrapper which has become partially inoperative, comprising:

    wherein the initial wrapper comprises an initial set of rules for extracting information and for assigning labels from a wrapper set of labels to the extracted information;

    extracting strings from the Web page parsed in a forward direction using the initial set of rules;

    analyzing the extracted strings according to the initial set of rules for assigning labels associated with the wrapper;

    assigning labels to those strings which satisfy the label rules;

    extracting strings from the Web page in a backward (opposite) direction using the initial set of rules;

    analyzing the extracted strings according to the set of rules for assigning labels associated with the wrapper; and

    assigning labels to those unlabeled strings which satisfy the label rules.
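
The claimed steps can be illustrated with a minimal sketch. It is not the patented implementation; it only shows one way a forward pass followed by a backward pass over the same rule set might recover a value after a page layout change. Everything named here is an assumption made for illustration: the `Rule` fields, the use of a left delimiter for the forward scan and a right delimiter for the backward scan, the regular-expression label rules, and the sample page and rules.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical wrapper rule; all fields are illustrative assumptions."""
    label: str    # label taken from the wrapper's set of labels
    left: str     # delimiter expected just before the value (forward scan)
    right: str    # delimiter expected just after the value (backward scan)
    pattern: str  # label rule: regex the extracted string must fully match


def extract_forward(page, rule):
    """Scan from the start: take the text after the left delimiter up to the next tag."""
    start = page.find(rule.left)
    if start == -1:
        return None  # the rule is inoperative in the forward direction
    start += len(rule.left)
    end = page.find("<", start)
    if end == -1:
        end = len(page)
    return page[start:end].strip()


def extract_backward(page, rule):
    """Scan from the end: take the text before the right delimiter back to the previous tag."""
    end = page.rfind(rule.right)
    if end == -1:
        return None
    start = page.rfind(">", 0, end) + 1  # rfind gives -1 if absent, so start becomes 0
    return page[start:end].strip()


def satisfies(rule, string):
    """Apply the label rule to an extracted string."""
    return string is not None and re.fullmatch(rule.pattern, string) is not None


def extract(page, rules):
    """Forward pass first, then a backward pass for labels that are still unassigned."""
    labeled = {}
    for rule in rules:
        string = extract_forward(page, rule)
        if satisfies(rule, string):
            labeled[rule.label] = string
    for rule in rules:
        if rule.label in labeled:
            continue  # only strings left unlabeled by the forward pass are retried
        string = extract_backward(page, rule)
        if satisfies(rule, string):
            labeled[rule.label] = string
    return labeled


# Assumed example: the site replaced <b> with <strong>, which breaks the
# forward "price" rule, while the right delimiter " USD" is still present.
RULES = [
    Rule("title", "<h1>", "</h1>", r"[A-Za-z ]+"),
    Rule("price", "<b>Price:</b> ", " USD", r"\d+\.\d{2}"),
]
PAGE = "<h1>Dune</h1><p><strong>Price:</strong> 9.99 USD</p>"

print(extract(PAGE, RULES))  # {'title': 'Dune', 'price': '9.99'}
```

In this sketch the forward pass labels only the title, because the left delimiter of the price rule no longer occurs in the page. The backward pass then recovers the price from the unchanged right delimiter, and the same label rule is applied before the label is assigned.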
