IMAGE PROCESSING OF WEBPAGES

  • US 20200134401A1
  • Filed: 10/26/2018
  • Published: 04/30/2020
  • Est. Priority Date: 10/26/2018
  • Status: Active Grant
First Claim

1. A system for automated feature extraction of webpages including machine-processable information, the system comprising:

  • a markup language engine configured to:

    identify a plurality of webpages, and

    process markup language of the plurality of webpages to determine that a subset of the plurality of webpages includes a target characteristic;

  • a rendering engine configured to:

    determine, for a webpage of the subset, that a first image overlaps at least a portion of a second image in the webpage based at least on markup language of the webpage, and

    generate, for the webpage of the subset, an image of the webpage such that the portion of the second image is obscured by the first image; and

  • a detection engine configured to:

    determine, for the webpage of the subset, at least one graphical feature of the webpage by processing the image of the webpage, the at least one graphical feature corresponding to the portion of the second image,

    determine, for the webpage of the subset, that the at least one graphical feature corresponds to graphical features of images of a different plurality of webpages associated with a target entity, and

    generate, responsive to the determination that the at least one graphical feature corresponds to the graphical features of images of the different plurality of webpages, an association between the webpage and the target entity for storage in a database.
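The overlap determination recited for the rendering engine can be illustrated with a minimal sketch. It is not taken from the patent: the `Box` type, the `obscured_region` function, and the use of a single `z_index` value to decide stacking order are all assumptions made for illustration, and real markup-based layout (document order, stacking contexts, transparency) is more involved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned layout box of an image element (hypothetical type)."""
    left: int
    top: int
    width: int
    height: int
    z_index: int = 0  # assumed stand-in for stacking order from the markup


def obscured_region(first: Box, second: Box) -> Optional[Box]:
    """Return the part of `second` covered by `first`, or None.

    Simplifying assumption: `first` obscures `second` only when it has a
    strictly higher z_index and the rectangles intersect with non-zero area.
    """
    if first.z_index <= second.z_index:
        return None
    # Intersection of the two rectangles.
    left = max(first.left, second.left)
    top = max(first.top, second.top)
    right = min(first.left + first.width, second.left + second.width)
    bottom = min(first.top + first.height, second.top + second.height)
    if right <= left or bottom <= top:
        return None  # rectangles do not overlap
    return Box(left, top, right - left, bottom - top, first.z_index)
```

Under these assumptions, a non-`None` result marks the portion of the second image that would appear obscured in the rendered image of the webpage.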
