SYSTEMS AND METHODS FOR CONTENT EXTRACTION

  • US 20130326332A1
  • Filed: 05/23/2013
  • Published: 12/05/2013
  • Est. Priority Date: 03/30/2005
  • Status: Active Grant
First Claim

1. A method for automatically classifying a markup language text that is accessible at an Internet domain comprising:

  • (a) retrieving from one or more data repositories, data associated with the Internet domain;
  • (b) computing a first identifier for the Internet domain based on at least the data associated with the Internet domain and the markup language text;
  • (c) computing a measure of similarity between the computed first identifier and each of a first plurality of previously classified identifiers; and
  • (d) assigning the markup language text a classification based on the computed measure of similarity between the computed first identifier and each of the first plurality of previously classified identifiers.
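
Steps (a) through (d) can be illustrated with a minimal sketch. The claim does not say what form the identifier takes, which similarity measure is used, or how the classification is chosen from the similarity scores, so everything concrete below is an assumption made for illustration: the identifier is a set of word tokens, the similarity is the Jaccard index, and the classification is the label of the most similar previously classified identifier. The function names and the dictionary-based "repositories" are hypothetical and do not come from the patent.

```python
import re
from typing import Dict, FrozenSet, List, Tuple

Identifier = FrozenSet[str]


def retrieve_domain_data(domain: str, repositories: List[Dict[str, str]]) -> List[str]:
    """Step (a): collect whatever each repository holds for the domain.

    Repositories are modeled as plain dicts (an assumption for this sketch).
    """
    return [repo[domain] for repo in repositories if domain in repo]


def compute_identifier(domain_data: List[str], markup_text: str) -> Identifier:
    """Step (b): build an identifier from the domain data and the markup text.

    Assumed form: the set of lowercase alphanumeric tokens, with tags removed.
    """
    visible_text = re.sub(r"<[^>]+>", " ", markup_text)  # crude tag stripping
    combined = " ".join(domain_data) + " " + visible_text
    return frozenset(re.findall(r"[a-z0-9]+", combined.lower()))


def similarity(a: Identifier, b: Identifier) -> float:
    """Step (c): Jaccard index, one possible measure of similarity."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def classify(identifier: Identifier,
             classified: List[Tuple[Identifier, str]]) -> Tuple[str, float]:
    """Step (d): assign the label of the most similar classified identifier."""
    if not classified:
        raise ValueError("at least one previously classified identifier is required")
    scored = [(label, similarity(identifier, known)) for known, label in classified]
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical data, not taken from the patent.
    repositories = [{"example.com": "registrar example"},
                    {"example.com": "category sports"}]
    page = "<html><body><h1>Football scores</h1><p>Latest sports news</p></body></html>"
    known = [
        (frozenset({"sports", "football", "scores", "news"}), "sports"),
        (frozenset({"stocks", "bonds", "markets", "news"}), "finance"),
    ]
    data = retrieve_domain_data("example.com", repositories)
    print(classify(compute_identifier(data, page), known))
```

A nearest-neighbor rule is only one way to turn the similarity scores into a classification; a threshold or a vote across several close matches would fit the claim language equally well.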
