Extraction of anchor explanatory text by mining repeated patterns

  • US 7,627,571 B2
  • Filed: 03/31/2006
  • Issued: 12/01/2009
  • Est. Priority Date: 03/31/2006
  • Status: Active Grant
First Claim

1. A computing device with a computer-readable storage medium with instructions for identifying explanatory text for a display page, the computer-readable storage medium comprising:

  • a find repeated patterns component that identifies repeated patterns of elements within a display page by comparing elements of the display page to other elements of the display page, a repeated pattern having an anchor along with text associated, the anchor being an element that includes a reference to a referenced display page, wherein patterns of elements are considered to be repeated when the patterns have the same number of elements and the patterns have an edit distance that is within a threshold;

    a find dominant anchor component that finds a dominant anchor within a repeated pattern by when the repeated pattern includes multiple anchors, when only one anchor of the repeated pattern contains a block element and has text associated with the anchor, designating that anchor as the dominant anchor; and

    when more than one anchor of the repeated pattern contains a block element and has text associated with the anchor, designating no anchor as the dominant anchor;

    an extract text component that extracts from a repeated pattern the text associated with the dominant anchor, wherein the extracted text represents explanatory text for the referenced display page; and

    wherein a summarization component generates a summary of a display page from the explanatory text extracted by the extract text component.
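
The claim's matching and selection rules can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Element` record, its field names, the use of Levenshtein distance over tag names, and the default threshold of 1 are all assumptions made for illustration. The claim does not say what happens when a pattern has a single anchor, or multiple anchors of which none qualifies; the sketch picks a behavior for both cases and marks each as an assumption in the comments.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Element:
    """Hypothetical stand-in for a display-page element."""
    tag: str
    text: str = ""           # text associated with the element
    is_anchor: bool = False  # element references another display page
    has_block: bool = False  # anchor contains a block element
    href: str = ""


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance between two sequences (assumed metric)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]


def is_repeated(p: List[Element], q: List[Element], threshold: int = 1) -> bool:
    """Two patterns repeat when they have the same number of elements
    and their edit distance is within the threshold."""
    if len(p) != len(q):
        return False
    return edit_distance([e.tag for e in p], [e.tag for e in q]) <= threshold


def find_dominant_anchor(pattern: List[Element]) -> Optional[Element]:
    """Apply the claim's rule for patterns with multiple anchors."""
    anchors = [e for e in pattern if e.is_anchor]
    if len(anchors) == 1:
        # Assumption: a lone anchor is treated as dominant (not stated in the claim).
        return anchors[0]
    candidates = [a for a in anchors if a.has_block and a.text]
    # Exactly one qualifying anchor: dominant. More than one: none.
    # Assumption: zero qualifying anchors also yields none.
    return candidates[0] if len(candidates) == 1 else None


def extract_text(pattern: List[Element]) -> Optional[str]:
    """Return the text associated with the dominant anchor, if any."""
    dominant = find_dominant_anchor(pattern)
    return dominant.text if dominant else None


# Two list items with the same shape: a headline anchor plus a "more" link.
item1 = [Element("a", "Story one summary", True, True, "/s1"),
         Element("a", "more", True, False, "/s1#c")]
item2 = [Element("a", "Story two summary", True, True, "/s2"),
         Element("a", "more", True, False, "/s2#c")]
ambiguous = [Element("a", "First", True, True, "/x"),
             Element("a", "Second", True, True, "/y")]
```

Under these assumptions, `item1` and `item2` count as a repeated pattern, the headline anchor of each is dominant, and `ambiguous` has no dominant anchor because two of its anchors qualify. A summarization component could then collect the strings returned by `extract_text` for every repeated pattern that links to a given page.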
