
System and method for content selection for web page indexing

  • US 10,303,722 B2
  • Filed: 05/05/2009
  • Issued: 05/28/2019
  • Est. Priority Date: 05/05/2009
  • Status: Active Grant
First Claim

1. A method for indexing a webpage comprising:

  • retrieving, by an indexer server, a plurality of webpages to be indexed from a webpage data store, wherein the indexer server comprises one or more computer systems configured to perform indexing of webpage documents within the webpage data store;

    determining, by the indexer server, for each of the plurality of webpages, a document object model (DOM) containing one or more DOM elements within each webpage;

    computing, by the indexer server, a DOM element identifier for each of the one or more DOM elements within each of the plurality of webpages, wherein each DOM element identifier is computed based on the content within the corresponding DOM element;

    determining, by the indexer server, a first subset of the plurality of DOM elements having DOM element identifiers that satisfy a content similarity threshold to the DOM element identifiers of the other DOM elements;

    retrieving attention history data associated with each of the first subset of DOM elements, wherein the attention history data for each particular DOM element is based on previous user interface events detected within the particular DOM element;

    combining the attention history data associated with each of the first subset of DOM elements, and comparing the combined attention history data to an attention history threshold level; and

    in response to a determination that the combined attention history data associated with the first subset of DOM elements meets the attention history threshold level, indexing, by the indexer server, each of the first subset of DOM elements.
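
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it is hypothetical (`DomElement`, `element_identifier`, `elements_to_index`), and it makes three assumptions the claim does not specify: the content-based identifier is the set of lowercased word tokens, similarity between identifiers is Jaccard similarity, and attention history is a per-element count of user interface events that is combined by summation. DOM parsing and the indexing step itself are left out; the function only returns the elements that would be indexed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomElement:
    page_url: str   # page the element was found on
    path: str       # location of the element within that page's DOM
    content: str    # text content of the element


def element_identifier(element: DomElement) -> frozenset:
    """Content-based identifier (assumption: the set of lowercased word tokens)."""
    return frozenset(element.content.lower().split())


def similarity(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity between two identifiers (assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_similar_subset(elements, similarity_threshold):
    """Keep elements whose identifier is similar enough to at least one other element's."""
    ids = [element_identifier(e) for e in elements]
    subset = []
    for i, element in enumerate(elements):
        if any(
            j != i and similarity(ids[i], ids[j]) >= similarity_threshold
            for j in range(len(elements))
        ):
            subset.append(element)
    return subset


def elements_to_index(elements, attention_history,
                      similarity_threshold, attention_threshold):
    """Return the subset to index, or an empty list if attention is too low.

    attention_history maps (page_url, path) to a count of previously detected
    user interface events (assumption: a simple integer counter).
    """
    subset = select_similar_subset(elements, similarity_threshold)
    # Combine the attention history of the subset (assumption: by summing).
    combined = sum(attention_history.get((e.page_url, e.path), 0) for e in subset)
    if subset and combined >= attention_threshold:
        return subset
    return []
```

Under these assumptions, two elements with the same text on different pages form the subset, and they are returned for indexing only when their summed event counts reach the attention threshold.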

  • 5 Assignments