
Eliminating noise in periodicals

  • US 9,251,228 B1
  • Filed: 04/21/2011
  • Issued: 02/02/2016
  • Est. Priority Date: 04/21/2011
  • Status: Expired due to Fees
First Claim

1. A method, comprising:

  • preprocessing, by a server computer system, each item in a set of items using one or more rules, wherein preprocessing an item in the set of items comprises:

    determining that the item includes a print option; and

    using a version of the item associated with the print option instead of an alternate version;

  • removing, by the server computer system, global noise from the set of items using semantic similarities across items in the set of items; and

  • removing, by the server computer system, local noise in the item of the set of items, wherein removing the local noise in the item of the set of items comprises:

    determining an amount of content for a node associated with the item;

    calculating a content score for the node based on the amount of content;

    calculating a link density for the node based on a number of links in the node as a percentage of the content;

    calculating a local noise score for the node based on the content score and the link density; and

    removing the node responsive to a determination that the local noise score is above a threshold.
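
The local-noise steps of claim 1 can be illustrated with a minimal Python sketch. The claim does not state any formulas, so everything concrete here is an assumption made for illustration: the `Node` class, counting content in words, the `saturation` constant, the equal weighting of the two signals, and the threshold value are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a node of a parsed item (e.g. a DOM element)."""
    text: str = ""
    link_count: int = 0
    children: List["Node"] = field(default_factory=list)


def content_amount(node: Node) -> int:
    # Assumption: "amount of content" is measured as a word count.
    return len(node.text.split())


def content_score(amount: int, saturation: int = 50) -> float:
    # Assumption: the score grows linearly with content and caps at 1.0.
    return min(amount / saturation, 1.0)


def link_density(link_count: int, amount: int) -> float:
    # Number of links as a fraction of the content, clamped to [0, 1].
    if amount == 0:
        return 1.0 if link_count else 0.0
    return min(link_count / amount, 1.0)


def local_noise_score(node: Node) -> float:
    # Assumption: little content and many links both raise the noise score,
    # and the two signals are weighted equally.
    amount = content_amount(node)
    return 0.5 * (1.0 - content_score(amount)) + 0.5 * link_density(node.link_count, amount)


def prune_local_noise(root: Node, threshold: float = 0.5) -> None:
    # Remove every descendant whose noise score is above the threshold,
    # then recurse into the children that were kept.
    root.children = [c for c in root.children if local_noise_score(c) <= threshold]
    for child in root.children:
        prune_local_noise(child, threshold)


if __name__ == "__main__":
    nav = Node(text="Home News Sports Contact", link_count=4)
    body = Node(text=" ".join(["word"] * 60), link_count=1)
    page = Node(children=[nav, body])
    prune_local_noise(page)
    print(len(page.children))  # only the article body remains
```

Under these assumed formulas, a short navigation block made up almost entirely of links scores close to 1 and is removed, while a long article body with a single link scores close to 0 and is kept.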
