EXTRACTING PRINCIPAL CONTENT FROM WEB PAGES

  • US 20130124513A1
  • Filed: 07/31/2012
  • Published: 05/16/2013
  • Est. Priority Date: 11/10/2011
  • Status: Active Grant
First Claim

1. A method of extracting principal content from Web pages, comprising:

  • identifying and classifying items on the Web page;
  • building a list of candidates;
  • calculating candidate scores;
  • selecting a top score candidate;
  • performing clean up processing for the top score candidate; and
  • performing final page processing for the top score candidate.
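
The six claimed steps can be read as a pipeline. The Python sketch below is one hypothetical way to arrange them, not the patented implementation. The claim does not say how items are classified or how scores are computed, so everything concrete here is an assumption made for illustration: the tag sets, the link-density scoring heuristic, the whitespace-only clean up, and all names (ItemCollector, Block, score, extract_principal_content).

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser

# Assumed classification rules; the claim does not name specific tags.
CONTAINER_TAGS = {"div", "article", "section", "main", "td"}
NOISE_TAGS = {"script", "style", "nav", "footer", "aside"}


@dataclass
class Block:
    """A container element together with the text attributed to it."""
    tag: str
    text: list = field(default_factory=list)  # text runs outside links
    link_chars: int = 0                       # characters inside <a> elements


class ItemCollector(HTMLParser):
    """Step 1: identify items and classify them as container, link, or noise."""

    def __init__(self):
        super().__init__()
        self.blocks = []      # containers that have been closed
        self.open = []        # stack of containers still open
        self.noise_depth = 0
        self.link_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.noise_depth += 1
        elif tag == "a":
            self.link_depth += 1
        elif tag in CONTAINER_TAGS:
            self.open.append(Block(tag))

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS:
            self.noise_depth = max(0, self.noise_depth - 1)
        elif tag == "a":
            self.link_depth = max(0, self.link_depth - 1)
        elif tag in CONTAINER_TAGS and self.open:
            self.blocks.append(self.open.pop())

    def handle_data(self, data):
        text = data.strip()
        if not text or self.noise_depth or not self.open:
            return
        block = self.open[-1]  # attribute text to the innermost open container
        if self.link_depth:
            block.link_chars += len(text)
        else:
            block.text.append(text)


def score(block):
    """Step 3 (assumed heuristic): reward plain text, penalize link-heavy blocks."""
    text_chars = sum(len(t) for t in block.text)
    total = text_chars + block.link_chars
    return text_chars * (text_chars / total) if total else 0.0


def extract_principal_content(html):
    collector = ItemCollector()
    collector.feed(html)                                   # step 1: identify and classify
    collector.close()
    candidates = [b for b in collector.blocks if b.text]   # step 2: build candidate list
    if not candidates:
        return ""
    top = max(candidates, key=score)                       # steps 3-4: score, pick the top
    cleaned = [" ".join(t.split()) for t in top.text]      # step 5: clean up (whitespace only)
    return "\n".join(cleaned)                              # step 6: final assembly
```

Calling extract_principal_content on a page string returns the text of the highest-scoring container, one text run per line, or an empty string when no container holds any text. The sketch assumes well-formed markup; a real extractor would need to tolerate unbalanced tags and would likely use richer signals than character counts.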
