METHODS AND APPARATUS TO AUTOMATICALLY CRAWL THE INTERNET USING IMAGE ANALYSIS
First Claim
1. A method to visually identify components of a web page, comprising:
rendering a web page to generate an image;
visually analyzing at least a portion of the image with a machine to detect a region containing a possible web page component;
automatically determining a type of the possible web page component; and
storing the web page component type and a location of the portion of the web page.
Abstract
Methods and apparatus to automatically crawl the Internet using image analysis are disclosed. An example method to visually identify components of a web page includes rendering a web page in a web browser to generate an image, and visually analyzing at least a portion of the image with a machine to detect a region containing a possible web page component. The example method further includes automatically determining a type of the detected web page component and storing the web page component type and a location of the portion of the web page.
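The detect, classify, and store steps summarized above can be illustrated with a minimal sketch. It is not the disclosed implementation. It assumes the rendering step has already produced a binary raster (here a small synthetic grid stands in for a browser screenshot), substitutes a simple connected-component flood fill for the visual analysis, and uses a made-up aspect-ratio rule to assign a type. The names Component, detect_regions, classify, identify_components, and make_demo_image are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    kind: str    # hypothetical type label, not a term from the claims
    left: int    # location of the region within the image
    top: int
    width: int
    height: int


def detect_regions(image):
    """Return (left, top, width, height) boxes of connected foreground blobs."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill over 4-connected foreground pixels.
            queue = deque([(r, c)])
            seen[r][c] = True
            min_r = max_r = r
            min_c = max_c = c
            while queue:
                y, x = queue.popleft()
                min_r, max_r = min(min_r, y), max(max_r, y)
                min_c, max_c = min(min_c, x), max(max_c, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and image[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((min_c, min_r, max_c - min_c + 1, max_r - min_r + 1))
    return boxes


def classify(width, height):
    """Toy rule (an assumption): wide, short boxes are treated as text fields."""
    return "text_field" if width >= 3 * height else "button"


def identify_components(image):
    """Detect regions, determine a type for each, and keep type plus location."""
    return [Component(classify(w, h), x, y, w, h)
            for x, y, w, h in detect_regions(image)]


def make_demo_image():
    """Synthetic 8x12 raster standing in for a rendered web page."""
    image = [[0] * 12 for _ in range(8)]
    for x in range(1, 10):          # 9x2 bar whose top-left corner is (1, 1)
        image[1][x] = image[2][x] = 1
    for y in range(5, 7):           # 2x2 block whose top-left corner is (4, 5)
        for x in range(4, 6):
            image[y][x] = 1
    return image


components = identify_components(make_demo_image())
```

In a real crawler the raster would come from an actual browser rendering, and the type decision would likely rely on richer visual features than a bounding-box aspect ratio. The sketch only shows how a detected region, its type, and its location can be recorded together.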
30 Claims
1. A method to visually identify components of a web page, comprising:
rendering a web page to generate an image;
visually analyzing at least a portion of the image with a machine to detect a region containing a possible web page component;
automatically determining a type of the possible web page component; and
storing the web page component type and a location of the portion of the web page.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. An apparatus to identify components in a web page, comprising:
an image generator to render an image of a web page based on web page information;
an image analyzer to automatically analyze the image to detect a web page component in the image and to generate location information corresponding to a location of the web page component in the image; and
a component identifier to automatically determine a type of the web page component by analyzing the image.
Dependent claims: 14, 15, 16, 17, 18, 19
20. An article of manufacture comprising instructions, which, upon execution, cause a machine to:
render a web page to generate an image;
visually analyze at least a portion of the image to detect a region containing a possible web page component;
automatically determine a type of the detected web page component; and
store the web page component type and a location of the portion of the web page.
Dependent claims: 21, 23, 26, 27, 29
22. (canceled)
24. (canceled)
25. (canceled)
28. (canceled)
30-52. (canceled)
Specification