Annotating HTML segments with functional labels

  • US 9,594,730 B2
  • Filed: 07/01/2010
  • Issued: 03/14/2017
  • Est. Priority Date: 07/01/2010
  • Status: Expired due to Fees
First Claim

1. A method comprising:

    processing a web page to determine a plurality of segments, wherein each segment from the plurality of segments includes one or more HTML elements;

    each machine-based classifier of a plurality of machine-based classifiers generating, based at least upon metadata associated with two or more segments from the plurality of segments that indicates one or more presentation features in the HTML elements of the two or more segments from the plurality of segments, a probability output for each segment of the two or more segments from the plurality of segments, wherein each functional category from the plurality of functional categories corresponds to a functional role of HTML elements in the web page;

    wherein each machine-based classifier from the plurality of machine-based classifiers corresponds to a functional category from the plurality of functional categories;

    assigning, based on the plurality of probability output, one or more functional categories to each segment of the two or more segments;

    a first application selecting a first set of functional categories from the plurality of functional categories;

    a second application that is different than the first application selecting a second set of functional categories from the plurality of functional categories, wherein the second set of functional categories does not include functional categories from the first set of functional categories;

    the first application selecting for processing, based upon the first set of functional categories and the functional categories assigned to the two or more segments, a first set of one or more segments from the two or more segments;

    the second application selecting for processing, based upon the second set of functional categories and the functional categories assigned to the two or more segments, a second set of one or more segments from the two or more segments, wherein the second set of one or more segments includes at least one segment that is not in the first set of one or more segments and the first set of one or more segments includes at least one segment that is not in the second set of one or more segments;

    the first application processing content contained in the first set of one or more segments and not processing content contained in the second set of one or more segments;

    the second application processing content contained in the second set of one or more segments and not processing content contained in the first set of one or more segments; and

    wherein the method is performed by one or more computing devices.
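
The flow recited in the claim can be illustrated with a minimal sketch. This is not the patented implementation: the `Segment` type, the two hand-written scoring functions standing in for machine-based classifiers, the feature names (`link_density`, `text_length`), the category names, and the 0.5 threshold are all hypothetical choices made for illustration. The sketch shows one classifier per functional category producing a probability for each segment, categories being assigned from those probabilities, and two applications each picking only the segments whose categories they asked for.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Segment:
    """A page segment plus presentation metadata (hypothetical structure)."""
    html: str
    features: Dict[str, float]
    categories: Set[str] = field(default_factory=set)


# Stand-ins for trained classifiers: each maps presentation features to a
# probability for exactly one functional category. Real classifiers would be
# learned from data; these formulas are assumptions for illustration only.
def navigation_probability(features: Dict[str, float]) -> float:
    return min(1.0, features.get("link_density", 0.0))


def main_content_probability(features: Dict[str, float]) -> float:
    length_score = min(1.0, features.get("text_length", 0.0) / 500.0)
    return length_score * (1.0 - features.get("link_density", 0.0))


# One classifier per functional category.
CLASSIFIERS: Dict[str, Callable[[Dict[str, float]], float]] = {
    "navigation": navigation_probability,
    "main_content": main_content_probability,
}


def assign_categories(segments: List[Segment], threshold: float = 0.5) -> None:
    """Assign every category whose probability meets the assumed threshold."""
    for segment in segments:
        for category, classifier in CLASSIFIERS.items():
            if classifier(segment.features) >= threshold:
                segment.categories.add(category)


def select_segments(segments: List[Segment], wanted: Set[str]) -> List[Segment]:
    """Return the segments labeled with at least one wanted category."""
    return [s for s in segments if s.categories & wanted]


segments = [
    Segment("<ul><li><a href='/'>Home</a></li></ul>",
            {"link_density": 0.9, "text_length": 20}),
    Segment("<div><p>Long article text</p></div>",
            {"link_density": 0.05, "text_length": 800}),
]
assign_categories(segments)

# Two different applications choose disjoint sets of categories.
indexer_categories = {"main_content"}   # first application (hypothetical)
crawler_categories = {"navigation"}     # second application (hypothetical)

indexer_segments = select_segments(segments, indexer_categories)
crawler_segments = select_segments(segments, crawler_categories)

for s in indexer_segments:
    print("indexer processes:", s.html)
for s in crawler_segments:
    print("crawler processes:", s.html)
```

Thresholding each probability independently is only one possible way to turn the classifier outputs into category assignments; the claim requires only that the assignment be based on the probability outputs, and it allows a segment to receive more than one category.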
