Web-compatible electronic device, web page processing method, and program
First Claim
1. A web page processing method comprising:
acquiring a first web page including display elements;
identifying a headline and a story body in the first web page by creating an internal depiction of the first web page;
computing positions of the display elements based on the internal depiction;
classifying closely related display elements together as clusters based on the computed positions of the display elements;
discriminating a cluster of headlines and story bodies from other clusters by determining a center-of-gravity line which is a vertical line that crosses the largest number of display elements in the internal depiction and judging clusters as one of left, right, or middle in reference to the center-of-gravity line, wherein the cluster of headlines and story bodies is middle with respect to the other clusters;
forming groups, a group including clusters having the same character attributes;
calculating an average number of characters per cluster for each of the groups; and
determining the headline and the story body, wherein a group having a low average is the headline and a group having a high average is the story body;
creating a second web page including the story body from the first web page; and
creating a third web page including the headline from the first web page, wherein the headline in the third web page is a link to the second web page containing the story body.
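The discriminating step of the claim can be illustrated with a minimal sketch. It is not taken from the patent: the `Box` type, the function names, the choice of candidate positions (left edges of elements), the leftmost-wins tie-break, and the use of a cluster's horizontal extent for the left/right/middle judgment are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Horizontal extent of one rendered display element (hypothetical type)."""
    left: float
    right: float


def center_of_gravity_x(elements):
    """Return an x position whose vertical line crosses the most elements.

    Assumption: only left edges are tried as candidates, which is enough to
    find a maximum for closed intervals. Ties go to the leftmost candidate.
    """
    best_x, best_count = None, -1
    for candidate in sorted({e.left for e in elements}):
        count = sum(1 for e in elements if e.left <= candidate <= e.right)
        if count > best_count:
            best_x, best_count = candidate, count
    return best_x


def judge_cluster(cluster, line_x):
    """Judge a cluster as 'left', 'right', or 'middle' relative to the line."""
    left = min(e.left for e in cluster)
    right = max(e.right for e in cluster)
    if right < line_x:
        return "left"
    if left > line_x:
        return "right"
    return "middle"  # the line passes through the cluster's extent


# Hypothetical layout: a navigation column, a main column, and an ad column.
nav = [Box(0, 100), Box(0, 100)]
main = [Box(120, 500), Box(120, 500), Box(120, 480)]
ads = [Box(520, 640)]
line_x = center_of_gravity_x(nav + main + ads)
labels = [judge_cluster(c, line_x) for c in (nav, main, ads)]
```

In this sketch the main column is judged as middle, so it would be the cluster carried forward to the headline and story body determination.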
Abstract
A Web page having display elements such as a headline, a story body, subheads, and links to articles is obtained and rendered internally to obtain a position of each display element based on draw data. The display elements are classified into clusters according to their positions. Next, clusters having the same character attributes are classified as a group. The group with a high average number of characters per cluster is determined to be the story body, and the group with a low average is determined to be the headline. Then, individual pages including the story body and a top page including the headline, the subheads, and links to the story body pages are created. The Web page is thereby reconstructed as Web pages suitable for browsing in low-resolution display environments.
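The grouping and averaging steps described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the `Cluster` type is hypothetical, "character attributes" are modeled as a (font size, bold) pair, and the groups with the lowest and highest averages are taken as the headline and the story body.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """One cluster of display elements (hypothetical representation)."""
    text: str
    font_size: int  # assumed stand-in for "character attributes"
    bold: bool      # assumed stand-in for "character attributes"


def average_chars_by_group(clusters):
    """Group clusters by character attributes; average characters per cluster."""
    groups = defaultdict(list)
    for cluster in clusters:
        groups[(cluster.font_size, cluster.bold)].append(cluster)
    return {
        attrs: sum(len(c.text) for c in members) / len(members)
        for attrs, members in groups.items()
    }


def pick_headline_and_body(clusters):
    """Return (headline attributes, story body attributes).

    Assumption: the lowest-average group is the headline and the
    highest-average group is the story body.
    """
    averages = average_chars_by_group(clusters)
    headline_attrs = min(averages, key=averages.get)
    body_attrs = max(averages, key=averages.get)
    return headline_attrs, body_attrs


# Hypothetical clusters: two short bold headlines and two long plain bodies.
clusters = [
    Cluster("Storm hits coast", 18, True),
    Cluster("Markets rally", 18, True),
    Cluster("x" * 400, 12, False),
    Cluster("y" * 600, 12, False),
]
averages = average_chars_by_group(clusters)
headline_attrs, body_attrs = pick_headline_and_body(clusters)
```

Once the two groups are known, each story body cluster could be written to its own page and the headlines collected on a top page that links to those pages, as the abstract describes.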
60 Citations
9 Claims
1. A web page processing method comprising:
acquiring a first web page including display elements;
identifying a headline and a story body in the first web page by creating an internal depiction of the first web page;
computing positions of the display elements based on the internal depiction;
classifying closely related display elements together as clusters based on the computed positions of the display elements;
discriminating a cluster of headlines and story bodies from other clusters by determining a center-of-gravity line which is a vertical line that crosses the largest number of display elements in the internal depiction and judging clusters as one of left, right, or middle in reference to the center-of-gravity line, wherein the cluster of headlines and story bodies is middle with respect to the other clusters;
forming groups, a group including clusters having the same character attributes;
calculating an average number of characters per cluster for each of the groups; and
determining the headline and the story body, wherein a group having a low average is the headline and a group having a high average is the story body;
creating a second web page including the story body from the first web page; and
creating a third web page including the headline from the first web page, wherein the headline in the third web page is a link to the second web page containing the story body.
Dependent claims: 2, 3
4. A web-compatible electronic device comprising:
means for acquiring a first web page including display elements;
means for identifying a headline and a story body in the first web page comprising:
means for creating an internal depiction of the first web page;
means for computing positions of the display elements based on the internal depiction;
means for classifying closely related display elements together as clusters based on the computed positions of the display elements;
means for discriminating a cluster of headlines and story bodies from other clusters by determining a center-of-gravity line which is a vertical line that crosses the largest number of display elements in the internal depiction and judging clusters as one of left, right, or middle in reference to the center-of-gravity line, wherein the cluster of headlines and story bodies is middle with respect to the other clusters;
means for forming groups, a group including clusters having the same character attributes;
means for calculating an average number of characters per cluster for each of the groups; and
means for determining the headline and the story body, wherein a group having a low average is the headline and a group having a high average is the story body;
means for creating a second web page including the story body from the first web page; and
means for creating a third web page including the headline from the first web page, wherein the headline in the third web page is a link to the second web page containing the story body.
Dependent claims: 5, 6
7. A computer readable medium having a computer program for executing a web page processing method, the method comprising:
acquiring a first web page including display elements;
identifying a headline and a story body in the first web page by creating an internal depiction of the first web page;
computing positions of the display elements based on the internal depiction;
classifying closely related display elements together as clusters based on the computed positions of the display elements;
discriminating a cluster of headlines and story bodies from other clusters by determining a center-of-gravity line which is a vertical line that crosses the largest number of display elements in the internal depiction and judging clusters as one of left, right, or middle in reference to the center-of-gravity line, wherein the cluster of headlines and story bodies is middle with respect to the other clusters;
forming groups, a group including clusters having the same character attributes;
calculating an average number of characters per cluster for each of the groups; and
determining the headline and the story body, wherein a group having a low average is the headline and a group having a high average is the story body;
creating a second web page including the story body from the first web page; and
creating a third web page including the headline from the first web page, wherein the headline in the third web page is a link to the second web page containing the story body.
Dependent claims: 8, 9
Specification