SELECTIVE CONTENT EXTRACTION

  • US 20120089903A1
  • Filed: 06/30/2009
  • Published: 04/12/2012
  • Est. Priority Date: 06/30/2009
  • Status: Active Grant
First Claim

1. A method for extracting web content, comprising:

    detecting, within a web page, a hierarchical structure that includes a plurality of nodes;

    identifying potential article nodes from the plurality of nodes;

    selecting as an article node one of the identified potential article nodes with a highest rank in the hierarchical structure; and

    producing content extracted from the article node.
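
The four steps of claim 1 can be illustrated with a minimal sketch, written here in Python using only the standard library. This is not the patented implementation, and several choices are assumptions made for illustration. The "hierarchical structure" is taken to be the tree of HTML elements. A node counts as a "potential article node" when it has at least `min_paragraphs` direct `<p>` children, which is a hypothetical heuristic. "Highest rank" is read as closest to the root of the tree, which the claim text does not define. The names `Node`, `TreeBuilder`, and `extract_article` are likewise hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so the builder must not descend into them.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class Node:
    """One element in the page hierarchy (hypothetical helper type)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.items = []  # child Nodes and text strings, in document order
        self.depth = 0 if parent is None else parent.depth + 1

    def text(self):
        """Concatenate all text under this node, in document order."""
        parts = [i.text() if isinstance(i, Node) else i for i in self.items]
        return " ".join(p for p in parts if p)


class TreeBuilder(HTMLParser):
    """Step 1: detect the hierarchical structure as a tree of Nodes."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.items.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this tag, then leave it.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.current.items.append(stripped)


def iter_nodes(node):
    """Yield every node in the tree in document (pre-order) order."""
    yield node
    for item in node.items:
        if isinstance(item, Node):
            yield from iter_nodes(item)


def extract_article(html, min_paragraphs=2):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()

    # Step 2: identify potential article nodes (assumed heuristic:
    # enough direct <p> children).
    def paragraph_count(node):
        return sum(1 for i in node.items if isinstance(i, Node) and i.tag == "p")

    candidates = [n for n in iter_nodes(builder.root)
                  if paragraph_count(n) >= min_paragraphs]
    if not candidates:
        return None

    # Step 3: select the candidate ranked highest in the hierarchy
    # (assumed to mean smallest depth; ties go to the earliest in the document).
    article = min(candidates, key=lambda n: n.depth)

    # Step 4: produce the content extracted from the article node.
    return article.text()
```

Under these assumptions, a page whose main `<div>` holds two paragraphs and a nested `<blockquote>` with two more paragraphs yields two candidates, and the outer `<div>` is selected because it sits higher in the tree, so the extracted content includes the nested quote as well.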
