System and media for simplifying web contents, and method thereof
First Claim
1. A method for simplifying web contents, said method comprising:
requesting access to a target page, said target page comprising a web page;
acquiring said target page;
acquiring adjoining pages that adjoin said target page in accordance with a Document Object Model comprising image nodes and text nodes;
performing a difference operation to delete objects that are common among said target page and said adjoining pages from said target page to generate a simplified page, wherein said difference operation comprises calculating a significance of the objects included in said target page, wherein if said significance exceeds a predetermined threshold, said objects are not deleted even if said objects are common with the objects of said adjoining pages; and
audibly outputting said simplified page.
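As a rough illustration of the difference operation recited in claim 1, the following is a minimal sketch, not the patented implementation. It assumes each page has already been flattened into a list of object strings (for example, the text of DOM text nodes or the source of image nodes), and that an object counts as "common" if it appears in at least one adjoining page. The names `simplify`, `significance`, and `threshold` are hypothetical.

```python
from typing import Callable, Iterable, List


def simplify(
    target: List[str],
    adjoining: Iterable[List[str]],
    significance: Callable[[str], float],
    threshold: float,
) -> List[str]:
    """Drop objects shared with adjoining pages unless they are significant.

    Assumption: an object is "common" if it occurs in any adjoining page.
    """
    common = set()
    for page in adjoining:
        common.update(page)

    simplified = []
    for obj in target:
        # Keep objects unique to the target page, and also keep common
        # objects whose significance exceeds the threshold.
        if obj not in common or significance(obj) > threshold:
            simplified.append(obj)
    return simplified


target_page = ["nav", "headline", "footer", "story text"]
adjoining_pages = [["nav", "footer", "other story"], ["nav", "headline", "footer"]]
score = lambda obj: 1.0 if obj == "headline" else 0.0
result = simplify(target_page, adjoining_pages, score, threshold=0.5)
```

Here "headline" survives even though it also appears on an adjoining page, because its assumed significance exceeds the threshold, while "nav" and "footer" are removed.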
Abstract
A technique for the simplification of Web pages in order to access necessary information rapidly, when displaying or outputting Web pages using a small screen device or a voice browser. A method for simplifying Web contents includes the steps of acquiring a target page subject to simplification, acquiring adjoining pages that adjoin the target page and performing a difference operation to delete objects that are common among the target page and the adjoining pages from the target page to generate a simplified page.
25 Claims
1. A method for simplifying web contents, said method comprising:
requesting access to a target page, said target page comprising a web page;
acquiring said target page;
acquiring adjoining pages that adjoin said target page in accordance with a Document Object Model comprising image nodes and text nodes;
performing a difference operation to delete objects that are common among said target page and said adjoining pages from said target page to generate a simplified page, wherein said difference operation comprises calculating a significance of the objects included in said target page, wherein if said significance exceeds a predetermined threshold, said objects are not deleted even if said objects are common with the objects of said adjoining pages; and
audibly outputting said simplified page.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A computerized system for simplifying web contents comprising a server computer and a user computer arranged in a network, said server computer comprising:
a first server element for acquiring a target page;
a second server element for generating URLs of adjoining pages which are to be compared with said target page;
a third server element for acquiring said adjoining pages in accordance with a Document Object Model comprising image nodes and text nodes;
a fourth server element for comparing each object included in said target page and said adjoining pages;
a fifth server element for determining commonality of said objects and deleting common objects from said target page to generate a simplified page;
a computer-implemented module for calculating a significance of the objects included in said target page;
a computer-implemented module for not deleting said objects if said significance exceeds a first threshold, even if said objects are common with the objects of said adjoining pages; and
a computer-implemented module for deleting said object if said significance is less than a second threshold, or a content of said objects is an empty table element or list element;
wherein said user computer comprising a user browser for audibly outputting said simplified page.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21
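The keep-or-delete rules of claim 13 can be read as a small decision function. The sketch below is one possible reading under stated assumptions: the `PageObject` record, the `keep` function, the set of tags treated as table or list elements, and the order in which the rules are checked are all hypothetical, and the second threshold is assumed not to exceed the first.

```python
from dataclasses import dataclass

# Assumption: these HTML tags stand in for "table element or list element".
CONTAINER_TAGS = {"table", "ul", "ol", "dl"}


@dataclass
class PageObject:
    tag: str             # element name, e.g. "p" or "table"
    content: str         # text content of the element
    significance: float  # precomputed significance score
    is_common: bool      # True if also found on an adjoining page


def keep(obj: PageObject, first_threshold: float, second_threshold: float) -> bool:
    """Return True if the object should remain on the simplified page."""
    # Empty table or list elements are deleted.
    if obj.tag in CONTAINER_TAGS and not obj.content.strip():
        return False
    # Objects below the second threshold are deleted.
    if obj.significance < second_threshold:
        return False
    # Objects above the first threshold are kept even if common.
    if obj.significance > first_threshold:
        return True
    # Otherwise, common objects are deleted and unique ones are kept.
    return not obj.is_common
```

For example, with a first threshold of 0.7 and a second threshold of 0.2, a common object scoring 0.9 is kept, a common object scoring 0.5 is deleted, and any object scoring 0.1 is deleted regardless of commonality.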
22. A program storage device readable by machine, tangibly embodying a program of instructions, which when executed by a machine, perform a method for simplifying web contents, said method comprising:
requesting access to a target page, said target page comprising a web page;
acquiring a target page;
acquiring adjoining pages that adjoin said target page in accordance with a Document Object Model comprising image nodes and text nodes;
performing a difference operation for deleting objects that are common among said target page and said adjoining pages from said target page, wherein said difference operation comprises calculating a significance of the objects included in said target page, wherein if said significance exceeds a predetermined threshold, said objects are not deleted even if said objects are common with the objects of said adjoining pages;
generating a simplified page; and
audibly outputting said simplified page.
23. A method for simplifying web contents, said method comprising:
requesting access to a target page, said target page comprising a web page;
acquiring said target page;
acquiring adjoining pages that adjoin said target page in accordance with a Document Object Model comprising image nodes and text nodes, wherein said acquiring of adjoining pages further comprises;
determining pages of URLs whose directory is common with a URL of said target page or a URLs of links included in said target page;
determining pages of URLs whose parent directory is common with the URL of said target page or the URL of the links included in said target page; or
determining a top page of each directory under a root directory that includes the URL of said target page;
prioritizing URLs of said adjoining pages, wherein said prioritizing is determined based on either or both of an edit distance between a URL of said target page and URLs of said adjoining pages, or a relevance among URLs based on a number of co-occurrences or a number of cross-references between said target page and said adjoining pages;
performing a difference operation to delete objects that are common among said target page and said adjoining pages from said target page to generate a simplified page, wherein said performing uses DP matching to determine whether said objects are common, wherein said difference operation comprises calculating a significance of the objects included in said target page, wherein if said significance exceeds a predetermined threshold, said objects are not deleted even if said objects are common with the objects of said adjoining pages, wherein said calculating of the significance is represented by a sum of weighted feature values;
wherein said feature values comprising a character size of said objects, a numerical value assigned to fonts and other character attributes, a numerical value to identify whether said objects are a banner, a displacement value of said objects from a center of a screen, a number of keywords included in said objects, a numerical value assigned to information indicating whether said objects are added or updated, a ratio of updated characters of said objects, a numerical value assigned to information indicating whether said objects are one character, and a numerical value assigned to a tag class of said objects;
deleting an object which has a significance less than said predetermined threshold included in simplified pages, or a table element or list element whose content is empty;
performing a post-processing process comprising restoration of a list title, restoration of information at the top of or on a side of table, movement of a form to a rearward of the page, or reference of annotation information; and
audibly outputting said simplified page.
Dependent claims: 24, 25
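Three of the computations named in claim 23 lend themselves to a short sketch: prioritizing adjoining URLs by edit distance, matching objects by dynamic programming, and scoring significance as a sum of weighted feature values. The code below is illustrative only. It assumes Levenshtein distance as the edit distance, interprets "DP matching" as a longest-common-subsequence alignment over flattened object sequences, and uses hypothetical function names, feature names, and weights that do not come from the patent.

```python
from typing import Dict, List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def prioritize(target_url: str, candidates: List[str]) -> List[str]:
    """Order candidate adjoining URLs by closeness to the target URL."""
    return sorted(candidates, key=lambda url: edit_distance(target_url, url))


def common_objects(target: Sequence[str], other: Sequence[str]) -> List[str]:
    """Objects matched in order by an LCS-style DP alignment (assumed reading)."""
    n, m = len(target), len(other)
    # table[i][j] holds the LCS length of target[i:] and other[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if target[i] == other[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    matched, i, j = [], 0, 0
    while i < n and j < m:
        if target[i] == other[j]:
            matched.append(target[i])
            i, j = i + 1, j + 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return matched


def significance(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Sum of weighted feature values; features without a weight count as zero."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())
```

A usage example with made-up URLs: `prioritize("http://example.com/news/a.html", ["http://example.com/sports/index.html", "http://example.com/news/b.html"])` places the `news/b.html` page first, since it differs from the target URL by a single character.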
Specification