Small form factor web browsing
First Claim
1. One or more computer-readable storage media having computer-readable instructions thereon which, when executed by one or more processors, cause the one or more processors to perform a method comprising:
extracting high level structure information about a web page using markup language tag tree selection rules to define high level boundaries;
extracting low level structure information within each high level structure information by:
identifying explicit visual boundaries from the properties of one or more tags of the markup language tag tree by clustering the one or more tags in a pattern detection algorithm enabling the detection of one or more regions within the high level structure information;
identifying visual units by boundary detection from the markup language tag tree corresponding to each high level structure information; and
projecting each visual unit perpendicularly to an axis to identify implicit boundaries between the visual units;
storing the high level boundaries, the implicit boundaries, and the extracted high and low level structure information of the web page using an annotation mechanism, and
for a predetermined display screen having a width, identifying a plurality of sub-pages of the web page using the high level boundaries, the implicit boundaries, and the extracted high and low level structure information of the web page, wherein each sub-page has a width not greater than the width of the predetermined display screen;
forming a thumbnail image of the web page having partitions corresponding to the one or more sub-pages;
forming a hyperlink to each sub-page from a corresponding partition of the thumbnail image of the web page; and
storing the sub-pages, the thumbnail image, and the hyperlinks using the annotation mechanism.
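The last three steps of the claim (a partitioned thumbnail, one hyperlink per partition, and storage of the result) can be illustrated with an HTML image map. The following is only a minimal sketch under stated assumptions: the `SubPage` class, the `thumbnail_image_map` function, the `thumb.png` file name, and the `<name>.html` link scheme are all hypothetical and do not come from the patent, which does not say how the partitions or links are encoded.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class SubPage:
    """Hypothetical record for one sub-page rectangle, in page pixels."""
    name: str
    x: int
    y: int
    w: int
    h: int


def thumbnail_image_map(sub_pages, page_width, thumb_width, thumb_src="thumb.png"):
    """Build an HTML image map whose areas link to the sub-pages.

    Each sub-page rectangle is scaled from page coordinates down to
    thumbnail coordinates and becomes one clickable <area>.
    """
    scale = thumb_width / page_width
    areas = []
    for sp in sub_pages:
        x1 = round(sp.x * scale)
        y1 = round(sp.y * scale)
        x2 = round((sp.x + sp.w) * scale)
        y2 = round((sp.y + sp.h) * scale)
        name = escape(sp.name)
        areas.append(
            f'<area shape="rect" coords="{x1},{y1},{x2},{y2}" '
            f'href="{name}.html" alt="{name}">'
        )
    lines = [
        f'<img src="{escape(thumb_src)}" usemap="#subpages" alt="page thumbnail">',
        '<map name="subpages">',
        *areas,
        "</map>",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    pages = [SubPage("header", 0, 0, 800, 100), SubPage("body", 0, 100, 800, 500)]
    print(thumbnail_image_map(pages, page_width=800, thumb_width=200))
```

An image map is just one convenient way to tie a thumbnail partition to a link; any representation that maps a thumbnail rectangle to a sub-page address would serve the same purpose.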
Abstract
A large web page is analyzed and partitioned into smaller sub-pages so that a user can navigate the web page on a small form factor device. The user can browse the sub-pages to find and read information in the content of the large web page. The partitioning can be performed at a web server, at an edge server, at the small form factor device, or distributed across one or more such devices. The analysis leverages design habits of a web page author to extract a representation structure of an authored web page. The extracted representation structure includes high level structure using several markup language tag selection rules and low level structure using visual boundary detection in which visual units of the low level structure are provided by clustering markup language tags. User viewing habits can be learned to display favorite parts of a web page.
9 Claims
4. A method comprising:
using a markup language tag tree of the web page to extract regions of the web page including a header, a footer, left and right side bar regions, and one or more body regions encompassed by the header region, the footer region, the left side bar region, and the right side bar region;
identifying visual boundaries within each region by:
arranging the properties of the tags of the markup language tag tree by clustering the tags in a pattern detection algorithm enabling the detection of the regions of the web page;
projecting, normal to an axis, each shape represented by one or more semantic units of the tags of the markup language tag tree in each region;
determining the additional visual boundaries from projection values on the axis; and
for a predetermined display screen, identifying a plurality of sub-pages of the web page using the identified visual boundaries and the regions, where each sub-page has a width not greater than the width of the predetermined display screen;
forming a thumbnail image of the web page having partitions corresponding to the one or more sub-pages;
forming a hyperlink to each sub-page from a corresponding partition of the thumbnail image of the web page; and
storing the sub-pages, the thumbnail image, and the hyperlinks using an annotation mechanism.
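The sub-page identification step of claim 4 requires only that each sub-page be no wider than the display screen. One way to picture this is a greedy pass over the horizontal extents of blocks that the detected boundaries have already separated. This is a sketch under assumptions, not the patented procedure: the `group_into_sub_pages` function and the `(left, right)` block representation are hypothetical, and the greedy strategy is merely one simple choice.

```python
def group_into_sub_pages(blocks, screen_width):
    """Greedily merge neighbouring blocks into sub-pages.

    blocks: list of (left, right) x-extents in pixels, assumed to be
            non-overlapping pieces already split at detected boundaries.
    Returns a list of sub-pages, each a list of blocks whose combined
    span (first left edge to last right edge) fits within screen_width.
    A single block wider than the screen is returned on its own; it
    would need further splitting at lower-level boundaries.
    """
    sub_pages = []
    current = []
    for left, right in sorted(blocks):
        if current and right - current[0][0] <= screen_width:
            # The block still fits next to the ones already collected.
            current.append((left, right))
        else:
            if current:
                sub_pages.append(current)
            current = [(left, right)]
    if current:
        sub_pages.append(current)
    return sub_pages


if __name__ == "__main__":
    blocks = [(0, 150), (150, 400), (400, 640), (640, 800)]
    for sub_page in group_into_sub_pages(blocks, screen_width=420):
        print(sub_page)
```

With a 420-pixel screen, the four example blocks collapse into two sub-pages of two blocks each, both 400 pixels wide.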
7. A method comprising:
analyzing a markup language tag tree of a web page to identify:
peripheral regions of the web page including header, footer, left, and right regions; and
one or more body regions adjacent to at least one peripheral region;
within the markup language tag tree that defines each peripheral and body region:
identifying visual boundaries given in the properties of the tags of the markup language tag tree by clustering the tags of the markup tag tree in a pattern detection algorithm enabling the detection of the peripheral regions of the web page and the one or more body regions of the web page; and
identifying blank areas by:
analyzing one or more functions on the basis of a layout structure of each function by:
configuring each function into a rectangle;
projecting each rectangle normally onto each of perpendicular axes;
determining one or more separators that are each normal to the axes as a function of a sum of projections on each axis where the quantity of the projections is less than a predetermined threshold; and
for a predetermined width of a display screen, identifying a plurality of sub-pages of the web page each having a width not greater than the predetermined width of the display screen and using:
the peripheral regions of the web page including header, footer, left, and right regions;
the one or more body regions adjacent to at least one peripheral region;
the visual boundaries identified in the properties of the tags of the markup language tag tree;
the one or more separators; and
forming a thumbnail image of the web page having partitions corresponding to the one or more sub-pages;
forming a hyperlink to each sub-page from a corresponding partition of the thumbnail image of the web page; and
storing the sub-pages, the thumbnail image, and the hyperlinks using an annotation mechanism.
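The blank-area step of claim 7 (rectangles projected onto perpendicular axes, with separators placed where the summed projections fall below a threshold) resembles a projection-profile cut. The sketch below shows that idea on integer pixel coordinates. The names `projection_profile` and `find_separators`, the `(x, y, w, h)` tuple layout, and the default threshold of 1 are assumptions made for illustration; the claim itself does not fix any of them.

```python
def projection_profile(rects, axis, length):
    """Count how many rectangles cover each pixel position along one axis.

    rects:  list of (x, y, w, h) tuples in page pixels.
    axis:   "x" to project onto the horizontal axis, "y" for the vertical.
    length: page extent along that axis.
    """
    profile = [0] * length
    for x, y, w, h in rects:
        start, size = (x, w) if axis == "x" else (y, h)
        for pos in range(max(start, 0), min(start + size, length)):
            profile[pos] += 1
    return profile


def find_separators(profile, threshold=1):
    """Return (start, end) runs, end exclusive, where coverage < threshold."""
    runs = []
    run_start = None
    for pos, count in enumerate(profile):
        if count < threshold:
            if run_start is None:
                run_start = pos
        elif run_start is not None:
            runs.append((run_start, pos))
            run_start = None
    if run_start is not None:
        runs.append((run_start, len(profile)))
    return runs


if __name__ == "__main__":
    # Two blocks side by side on top, one full-width block below a gap.
    rects = [(0, 0, 100, 50), (120, 0, 80, 50), (0, 70, 200, 30)]
    print(find_separators(projection_profile(rects, "y", 100)))
    print(find_separators(projection_profile(rects, "x", 200), threshold=2))
```

In this example the vertical profile drops to zero between rows 50 and 70, which yields a horizontal separator. On the horizontal axis the full-width block hides the gap between the two upper blocks at threshold 1, and raising the threshold to 2 exposes it, which suggests why a tunable threshold is useful.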
Specification