Display annotation and layout processing
First Claim
1. An information processing method comprising:
- providing an annotation for multiple page files, including the steps of;
obtaining a plurality of page files from a web site;
generating a group of said page files, page layout structures of which are at least similar by analyzing said page files to introduce structural descriptive forms for said page layout structures and to assign characteristic values for said structural descriptive forms;
employing said structural descriptive forms and said characteristic values to calculate an inter-page distance representing a similarity of said page files; and
grouping said page files, of which said inter-page distance is equal to or smaller than a predetermined value;
providing a first annotation for an arbitrary page file in said group; and
correlating said first annotation with at least a part of other page files of said group;
wherein said step of correlating said first annotation with said other page files in said group includes the steps of;
determining whether said first annotation should be applied for the page files of said group;
adding a second annotation, when the determination is false, for an arbitrary page file of a page group consisting of page files with which said first annotation is not correlated;
correlating said second annotation with at least a part of other page files of said page group; and
correcting a calculation expression for said inter-page distance, so that, at said step of generating a group, said page file with which said first annotation is correlated and said page files that are correlated with said second annotation do not fall in the same group.
Abstract
The present invention provides improvement of operations providing annotation and layout for an HTML page file. In an example embodiment, a page acquisition module obtains page files from a web server, and an HTML file analysis module extracts tags and characteristic values related to the layout. A page group detection module employs layout tags and their characteristic values to group page files that have the same or a similar layout. When an annotation addition module adds an annotation to an arbitrary page file in the obtained layout group, the annotation is applied for another page file in the layout group. When the layout group is divided or layout groups are unified, a correction module for the function of distance calculation corrects a calculation expression for a distance between pages or layout groups in order to reflect the division or unification results obtained by the user.
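The grouping step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the set `LAYOUT_TAGS`, the use of tag counts as "characteristic values", the weighted L1 distance, and the greedy seed-based grouping are all assumptions chosen for brevity, and every function name is hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical choice of layout-related tags; the source does not enumerate them.
LAYOUT_TAGS = {"table", "tr", "td", "div", "ul", "ol", "li", "h1", "h2", "h3", "p", "form"}


class LayoutTagCounter(HTMLParser):
    """Counts occurrences of layout-related start tags in one page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag in LAYOUT_TAGS:
            self.counts[tag] += 1


def layout_features(html):
    """Return a Counter mapping each layout tag to its count (assumed characteristic value)."""
    parser = LayoutTagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


def page_distance(a, b, weights=None):
    """Weighted L1 distance between two tag-count vectors (an assumed distance expression)."""
    weights = weights or {}
    tags = set(a) | set(b)
    return sum(weights.get(t, 1.0) * abs(a[t] - b[t]) for t in tags)


def group_pages(pages, threshold, weights=None):
    """Greedy grouping: a page joins the first group whose seed page is within the threshold."""
    features = {name: layout_features(html) for name, html in pages.items()}
    groups = []  # each group is a list of page names; group[0] is the seed page
    for name in pages:
        for group in groups:
            if page_distance(features[name], features[group[0]], weights) <= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups
```

With two table-based pages and one list-based page, a threshold of 2 would place the two table pages in one layout group and the list page in another. An annotation authored for one member of a group could then be looked up for the other members by group membership.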
35 Citations
16 Claims
1. An information processing method comprising:
providing an annotation for multiple page files, including the steps of;
obtaining a plurality of page files from a web site;
generating a group of said page files, page layout structures of which are at least similar by analyzing said page files to introduce structural descriptive forms for said page layout structures and to assign characteristic values for said structural descriptive forms;
employing said structural descriptive forms and said characteristic values to calculate an inter-page distance representing a similarity of said page files; and
grouping said page files, of which said inter-page distance is equal to or smaller than a predetermined value;
providing a first annotation for an arbitrary page file in said group; and
correlating said first annotation with at least a part of other page files of said group;
wherein said step of correlating said first annotation with said other page files in said group includes the steps of;
determining whether said first annotation should be applied for the page files of said group;
adding a second annotation, when the determination is false, for an arbitrary page file of a page group consisting of page files with which said first annotation is not correlated;
correlating said second annotation with at least a part of other page files of said page group; and
correcting a calculation expression for said inter-page distance, so that, at said step of generating a group, said page file with which said first annotation is correlated and said page files that are correlated with said second annotation do not fall in the same group.
Dependent claims: 2, 3, 4, 5, 6
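The final step of claim 1, correcting the distance expression so that pages carrying different annotations no longer share a group, can be illustrated with a small sketch. The claim does not say how the expression is corrected; the rule below (scaling up the weights of the features on which the two pages differ until their distance exceeds the threshold) is purely an assumption, and `weighted_distance`, `correct_weights`, and `margin` are hypothetical names.

```python
def weighted_distance(a, b, weights):
    """Weighted L1 distance between two feature dicts (assumed distance expression)."""
    keys = set(a) | set(b)
    return sum(weights.get(k, 1.0) * abs(a.get(k, 0) - b.get(k, 0)) for k in keys)


def correct_weights(a, b, weights, threshold, margin=1.1):
    """Return new weights under which distance(a, b) exceeds the threshold.

    a, b      -- feature dicts of two pages the user has annotated differently
    weights   -- current per-feature weights (missing features default to 1.0)
    threshold -- grouping threshold (the claim's "predetermined value")
    margin    -- hypothetical safety factor so the corrected distance clears the threshold
    """
    d = weighted_distance(a, b, weights)
    if d > threshold:
        return dict(weights)  # already separated; nothing to correct
    if d == 0:
        raise ValueError("identical features: reweighting cannot separate these pages")
    factor = margin * threshold / d
    corrected = dict(weights)
    for k in set(a) | set(b):
        if a.get(k, 0) != b.get(k, 0):
            # Only features that actually differ are emphasized.
            corrected[k] = weights.get(k, 1.0) * factor
    return corrected
```

Scaling only the differing features leaves distances between pages that agree on those features unchanged, which limits the side effects of a single user correction on the rest of the grouping.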
7. An information processing method comprising:
providing an annotation for multiple page files, including the steps of;
obtaining a plurality of page files from a web site;
generating a plurality of groups of said page files, wherein page layout structures of each group being at least similar by analyzing said page files to introduce structural descriptive forms for said page layout structures and to assign characteristic values for said structural descriptive forms;
employing said structural descriptive forms and said characteristic values to calculate an inter-page distance representing a similarity of said page files; and
grouping said page files into said groups, wherein each group has an inter-page distance equal to or smaller than a predetermined value;
providing a first annotation for an arbitrary page file in each said group; and
correlating said first annotation with at least a part of other page files of each said group;
introducing a representative structural descriptive form that represents said each group and a representative characteristic value for said representative structural descriptive form;
employing said representative structural descriptive form and said representative characteristic value to calculate an inter-group distance that delineates the similarity between said groups;
grouping said page files that are included in said groups, said inter-group distance of which is equal to or smaller than a predetermined value, and generating a common group;
adding an added annotation to a common area wherein part of the page layout structure of an arbitrary file, included in common for the members of said common group, is the same as or similar to at least a part of the page layout structure of a different page file; and
correlating said first annotation with said common area provided for said different page file included, in common, for said common group;
wherein said step of correlating said first annotation with said common area provided for said different page file includes the steps of;
determining whether said first annotation should be applied for said common area provided for the page files of said common group;
adding a second annotation, when the determination is false, to the common area of an arbitrary page file of a page group consisting of page files including said common area with which said first annotation is not correlated;
correlating said second annotation with at least a part of the common areas of other page files of said page group; and
correcting a calculation expression for said inter-group distance, so that, at said step of generating a common group, said page file including said common area correlated with said first annotation and said page files including said common areas correlated with said second annotation do not fall in the same common group.
Dependent claims: 8, 9
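The inter-group distance of claim 7 can be sketched in the same spirit. The claim only requires a "representative structural descriptive form" and a "representative characteristic value" per group; taking the mean feature vector as the representative, and merging groups greedily against the first member of each common group, are assumptions made here for illustration. All names are hypothetical.

```python
def representative(group):
    """Mean feature vector of a group, given as a list of per-page feature dicts."""
    keys = set().union(*group)
    return {k: sum(page.get(k, 0) for page in group) / len(group) for k in keys}


def group_distance(rep_a, rep_b, weights=None):
    """Weighted L1 distance between two group representatives (assumed expression)."""
    weights = weights or {}
    keys = set(rep_a) | set(rep_b)
    return sum(weights.get(k, 1.0) * abs(rep_a.get(k, 0) - rep_b.get(k, 0)) for k in keys)


def common_groups(groups, threshold, weights=None):
    """Greedily merge layout groups whose representatives lie within the threshold.

    groups -- dict mapping a group name to its list of per-page feature dicts
    Returns a list of common groups, each a list of group names.
    """
    reps = {name: representative(pages) for name, pages in groups.items()}
    merged = []  # merged[i][0] is the seed group of common group i
    for name in groups:
        for common in merged:
            if group_distance(reps[name], reps[common[0]], weights) <= threshold:
                common.append(name)
                break
        else:
            merged.append([name])
    return merged
```

An annotation attached to a shared page region (for example a site-wide header) could then be offered to every layout group inside the same common group, and a user rejection could feed a weight correction for `group_distance` analogous to the inter-page case.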
10. An information processing system for providing an annotation for multiple page files, comprising:
means for obtaining page files from a web site;
means for generating a group of said page files, page layout structures of which are the same or similar comprising means for analyzing said page files to introduce structural descriptive forms for said page layout structures and assign characteristic values for said structural descriptive forms;
means for employing said structural descriptive forms and said characteristic values to calculate an inter-page distance representing the similarity of said page files; and
means for grouping said page files, of which said inter-page distance is equal to or smaller than a predetermined value;
means for providing a first annotation for an arbitrary page file in said group; and
means for correlating said first annotation with at least a part of other page files of said group;
wherein said means for correlating said first annotation with said other page files in said group includes;
means for determining whether said first annotation should be applied for the page files of said group;
means for adding a second annotation, when the determination is false, for an arbitrary page file of a page group consisting of page files with which said first annotation is not correlated;
means for correlating said second annotation with at least a part of other page files of said page group; and
means for correcting a calculation expression for said inter-page distance, so that, at said step of generating a group, said page file correlated with said first annotation and said page files correlated with said second annotation do not fall in the same group.
Dependent claims: 11, 12, 13
14. An information processing system, for providing an annotation for multiple page files, comprising:
means for obtaining page files from a web site;
means for generating a plurality of groups of said page files, page layout structures of each group being the same or similar comprising means for analyzing said page files to introduce structural descriptive forms for said page layout structures and assign characteristic values for said structural descriptive forms;
means for employing said structural descriptive forms and said characteristic values to calculate an inter-page distance representing the similarity of said page files; and
means for grouping said page files, of which said inter-page distance is equal to or smaller than a predetermined value;
means for providing a first annotation for an arbitrary page file in each said group;
means for correlating said first annotation with at least a part of other page files of each said group;
means for introducing a representative structural descriptive form that represents said groups and a representative characteristic value for said representative structural descriptive form;
means for employing said representative structural descriptive form and said representative characteristic value to calculate an inter-group distance that delineates the similarity between said groups;
means for grouping said page files that are included in said groups, said inter-group distance of which is equal to or smaller than a predetermined value, and generating a common group;
means for adding an added annotation to a common area wherein part of the page layout structure of an arbitrary file, included in common for the members of said common group, is the same as or similar to at least a part of the page layout structure of a different page file; and
means for correlating said annotation with said common area provided for said different page file included in common for said common group
wherein said means for correlating said first annotation with said common area provided for said different page file includes;
means for determining whether said first annotation should be applied for said common area provided for the page files of said common group;
means for adding a second annotation, when the determination is false, to the common area of an arbitrary page file of a page group consisting of page files including said common area with which said first annotation is not correlated;
means for correlating said second annotation with at least a part of the common areas of other page files of said page group; and
means for correcting a calculation expression for said inter-group distance, so that, at said means for generating a common group, said page file including said common area correlated with said first annotation and said page files including said common areas correlated with said second annotation do not fall in the same common group.
Dependent claims: 15, 16
Specification