System and method for aggregating and ranking data from a plurality of web sites

  • US 8,880,498 B2
  • Filed: 09/27/2009
  • Issued: 11/04/2014
  • Est. Priority Date: 12/31/2008
  • Status: Expired due to Fees
First Claim

1. A method for automatically collecting data from a plurality of targeted web sites to aggregate said data;

  • the method comprising a plurality of stages;

    automatically and periodically querying for said data from a plurality of related sites, said related sites comprising at least one web page that was not previously analyzed;

    analyzing the results from said querying, said results comprising at least one webpage, said analyzing comprising;

    geometrical analyzing of a page layout of the webpage, wherein said geometrical analyzing comprises determining one or more geometrical properties of the webpage, wherein said determining one or more geometrical properties comprises decomposing said page layout of the document into a plurality of layout subareas to render said page layout to form a rendered layout, determining one or more rectangles in each of said layout subarea, and determining height, width and position of each of said rectangles to form said geometrical properties of said rendered layout;

    locating recurring patterns of said rectangles in said rendered layout;

    searching for a plurality of record containers within said recurring patterns of said rectangles according to said layout subareas wherein said record containers are defined as having an organized inner structure of said rectangles;

    selecting a record container to form a selected record container;

    semantically analyzing a record from said selected record container to form a previously semantically analyzed record if a previously semantically analyzed record is not stored;

    determining a relevancy of a record to form a relevant record from said selected record container according to said one or more geometrical properties by comparing said recurring patterns of said rectangles and said organized inner rectangles of records to said recurring patterns of said rectangles and said organized inner rectangles of records of a previously semantically analyzed relevant record;

    storing the relevant record data in an aggregated data base to aggregate said data;

    storing said recurring patterns of rectangles to form stored recurring patterns of rectangles;

    comparing recurring patterns of rectangles on said at least one webpage that was not previously analyzed to said stored recurring patterns of rectangles to search for a match;

    if no match is found, performing said above stages of the method for said at least one webpage that was not previously analyzed; and

    retrieving said data from said aggregated data base, upon demand from user.
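
The geometrical stages of the claim (measuring rectangles, locating recurring patterns, and comparing them to stored patterns) can be illustrated with a minimal sketch. The claim does not say how rectangles are compared, so everything here is an assumption made for illustration: the `Rect` type, the size-bucketing `signature` function, the `tolerance` and `min_repeats` parameters, and the dictionary used as a pattern store are all hypothetical and are not taken from the patent.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """A rectangle from a rendered layout: position, width and height in pixels."""
    x: int
    y: int
    width: int
    height: int


def signature(rect, tolerance=5):
    """Bucket width and height so rectangles of nearly equal size compare equal.

    Assumption: size alone identifies a pattern; position is ignored here.
    """
    return (rect.width // tolerance, rect.height // tolerance)


def recurring_patterns(rects, min_repeats=3):
    """Return the set of size signatures that occur at least min_repeats times."""
    counts = Counter(signature(r) for r in rects)
    return frozenset(sig for sig, n in counts.items() if n >= min_repeats)


def find_match(patterns, stored_patterns):
    """Return the key of the first stored pattern set equal to patterns, else None."""
    for key, stored in stored_patterns.items():
        if stored == patterns:
            return key
    return None


# Hypothetical page: one header rectangle followed by three equally sized rows.
page = [Rect(0, 0, 800, 60)] + [Rect(0, 60 + i * 120, 800, 118) for i in range(3)]
patterns = recurring_patterns(page)

stored_patterns = {}
if find_match(patterns, stored_patterns) is None:
    # No match: this is where the full analysis stages would run for the new page.
    stored_patterns["site-a"] = patterns
```

In this sketch a later page whose rows differ by a pixel or two falls into the same size buckets and therefore matches the stored entry, which loosely corresponds to skipping the full analysis when a match is found. A real system would likely also need the relative positions and nesting of the rectangles, which this example deliberately leaves out.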
