Visual web page analysis system and method
First Claim
1. A visual web page analysis system for analyzing data of a web page based on vision, the system comprising:
a processor;
an image analyzing program executed by a processor to enable a processor to load information of a web page and to segment content of the web page into a plurality of blocks based on at least a visual feature of the web page;
a block analyzing program executed by the processor to enable the processor to classify the plurality of blocks based on at least an attribute of each block;
a vision identifying program executed by the processor to enable the processor to compare at least a relative feature of each block to determine a function of each classified block on the web page; and
an output program executed by the processor to enable the processor to collect the plurality of blocks and their functions into an information interface and to output the information interface,
wherein the processor provides an analyzed result shown on the information interface, and
wherein the block analyzing program executed by the processor further enables the processor to receive a plurality of web page tags, to determine the attribute of each block in accordance with the following formulas for Degree of Picture Hyperlink (DoPH), Picture Text Ratio (PTR), Degree of Local Text Hyperlink (DoLTH), Text Ratio, and Degree of Local Picture Hyperlink (DoLPH):
(DoTH)=(number of text hyperlinks in the block)/(number of text tags in the block), where a text tag is any HTML grammar instruction that can be used to present texts;
(DoPH)=(number of picture hyperlinks in the block)/(number of picture tags in the block), where a picture tag is any HTML grammar instruction that can be used to present pictures;
Text Ratio=(number of characters in the block)/(number of characters in the web page);
(PTR)=(number of image tags in the block)/(number of text tags in the block), where the PTR is used to measure a pictures-versus-text ratio in the block;
(DoLTH)=(number of local text hyperlinks in the block)/(number of text hyperlinks in the block), where the local text hyperlinks are text hyperlinks linked to the same web domain; and
(DoLPH)=(number of local picture hyperlinks in the block)/(number of picture hyperlinks in the block), where the local picture hyperlinks are picture hyperlinks linked to the same web domain.
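The six ratios recited in the claim can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the `BlockCounts` container, the `ratio` helper, and the choice to return 0.0 when a denominator is zero are hypothetical and are not taken from the claim, which does not address that case or say how the counts are obtained.

```python
from dataclasses import dataclass


@dataclass
class BlockCounts:
    """Raw counts for one block (hypothetical container, not from the claim)."""
    text_tags: int
    picture_tags: int
    text_hyperlinks: int
    picture_hyperlinks: int
    local_text_hyperlinks: int
    local_picture_hyperlinks: int
    characters: int


def ratio(numerator: int, denominator: int) -> float:
    """Divide, returning 0.0 for a zero denominator.

    Assumption: the claim does not define the zero-denominator case.
    """
    return numerator / denominator if denominator else 0.0


def block_attributes(block: BlockCounts, page_characters: int) -> dict[str, float]:
    """Compute the six ratios exactly as the formulas in the claim state them."""
    return {
        "DoTH": ratio(block.text_hyperlinks, block.text_tags),
        "DoPH": ratio(block.picture_hyperlinks, block.picture_tags),
        "TextRatio": ratio(block.characters, page_characters),
        "PTR": ratio(block.picture_tags, block.text_tags),
        "DoLTH": ratio(block.local_text_hyperlinks, block.text_hyperlinks),
        "DoLPH": ratio(block.local_picture_hyperlinks, block.picture_hyperlinks),
    }


# Made-up counts for a link-heavy block on a 2400-character page.
example = BlockCounts(
    text_tags=10,
    picture_tags=2,
    text_hyperlinks=8,
    picture_hyperlinks=1,
    local_text_hyperlinks=6,
    local_picture_hyperlinks=1,
    characters=120,
)
attributes = block_attributes(example, page_characters=2400)
```

With these made-up counts, most text tags carry hyperlinks (DoTH of 8/10) while the block holds a small share of the page's characters (Text Ratio of 120/2400). How such values map to a block class is not specified in the claim text shown here.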
Abstract
A visual web page analysis system includes an image analyzing unit, a block analyzing unit, a vision identifying unit, and an output unit. The image analyzing unit loads information of a web page and segments content of the web page into a plurality of blocks based on a visual feature. The block analyzing unit classifies the blocks based on an attribute of each block. The vision identifying unit compares at least a relative feature of each block to determine a function of each block on the web page. The output unit collects the blocks and their functions into an information interface and outputs the information interface.
20 Claims
1. A visual web page analysis system for analyzing data of a web page based on vision, the system comprising:
a processor;
an image analyzing program executed by a processor to enable a processor to load information of a web page and to segment content of the web page into a plurality of blocks based on at least a visual feature of the web page;
a block analyzing program executed by the processor to enable the processor to classify the plurality of blocks based on at least an attribute of each block;
a vision identifying program executed by the processor to enable the processor to compare at least a relative feature of each block to determine a function of each classified block on the web page; and
an output program executed by the processor to enable the processor to collect the plurality of blocks and their functions into an information interface and to output the information interface,
wherein the processor provides an analyzed result shown on the information interface, and
wherein the block analyzing program executed by the processor further enables the processor to receive a plurality of web page tags, to determine the attribute of each block in accordance with the following formulas for Degree of Picture Hyperlink (DoPH), Picture Text Ratio (PTR), Degree of Local Text Hyperlink (DoLTH), Text Ratio, and Degree of Local Picture Hyperlink (DoLPH):
(DoTH)=(number of text hyperlinks in the block)/(number of text tags in the block), where a text tag is any HTML grammar instruction that can be used to present texts;
(DoPH)=(number of picture hyperlinks in the block)/(number of picture tags in the block), where a picture tag is any HTML grammar instruction that can be used to present pictures;
Text Ratio=(number of characters in the block)/(number of characters in the web page);
(PTR)=(number of image tags in the block)/(number of text tags in the block), where the PTR is used to measure a pictures-versus-text ratio in the block;
(DoLTH)=(number of local text hyperlinks in the block)/(number of text hyperlinks in the block), where the local text hyperlinks are text hyperlinks linked to the same web domain; and
(DoLPH)=(number of local picture hyperlinks in the block)/(number of picture hyperlinks in the block), where the local picture hyperlinks are picture hyperlinks linked to the same web domain.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 14
12. A visual web page analysis method executed by a processor for analyzing data of a web page based on vision, comprising the steps of:
(a) loading information of a web page;
(b) segmenting content of the web page into a plurality of blocks based on at least a visual feature of the web page;
(c) classifying the plurality of blocks based on at least an attribute of each block;
(d) comparing at least a relative feature of each block to determine a function of each classified block on the web page; and
(e) collecting the plurality of blocks and their functions into an information interface, and outputting the information interface,
wherein step (d) further includes:
receiving a plurality of web page tags, to determine the attribute of each block in accordance with the following formulas for Degree of Picture Hyperlink (DoPH), Picture Text Ratio (PTR), Degree of Local Text Hyperlink (DoLTH), Text Ratio, and Degree of Local Picture Hyperlink (DoLPH):
(DoTH)=(number of text hyperlinks in the block)/(number of text tags in the block), where a text tag is any HTML grammar instruction that can be used to present texts;
(DoPH)=(number of picture hyperlinks in the block)/(number of picture tags in the block), where a picture tag is any HTML grammar instruction that can be used to present pictures;
Text Ratio=(number of characters in the block)/(number of characters in the web page);
(PTR)=(number of image tags in the block)/(number of text tags in the block), where the PTR is used to measure a pictures-versus-texts ratio in the block;
(DoLTH)=(number of local text hyperlinks in the block)/(number of text hyperlinks in the block), where the local text hyperlinks are text hyperlinks linked to the same web domain; and
(DoLPH)=(number of local picture hyperlinks in the block)/(number of picture hyperlinks in the block), where the local picture hyperlinks are picture hyperlinks linked to the same web domain.
Dependent claims: 13, 15, 16, 17, 18, 19, 20
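The claims define a local hyperlink as one linked to the same web domain. The Python sketch below shows one possible way to test that condition and to compute the resulting local-hyperlink ratio (DoLTH or DoLPH). It is an illustration only: the function names are hypothetical, and comparing hostnames for exact equality is an assumption, since the claims do not say how subdomains or relative links are treated.

```python
from urllib.parse import urljoin, urlparse


def is_local_link(href: str, page_url: str) -> bool:
    """Return True when href resolves to the same host as page_url.

    Assumption: "same web domain" is read as an exact hostname match.
    Relative links are resolved against the page URL first.
    """
    target = urlparse(urljoin(page_url, href))
    page = urlparse(page_url)
    return target.hostname is not None and target.hostname == page.hostname


def degree_of_local_hyperlinks(hrefs: list[str], page_url: str) -> float:
    """Share of hyperlinks that are local.

    This is DoLTH when hrefs holds the block's text hyperlinks and
    DoLPH when it holds the block's picture hyperlinks.
    """
    if not hrefs:
        return 0.0  # assumption: the claims do not define the empty case
    local = sum(is_local_link(h, page_url) for h in hrefs)
    return local / len(hrefs)


# Made-up page URL and links, for illustration only.
page = "https://example.com/news/index.html"
links = [
    "/about",
    "contact.html",
    "https://example.com/a",
    "https://other.example.org/b",
]
share = degree_of_local_hyperlinks(links, page)
```

Three of the four made-up links resolve to `example.com`, so `share` is 3/4. A looser reading of "same web domain" (for example, treating subdomains as local) would change only `is_local_link`.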
Specification