Information rearrangement method, information processing apparatus and information processing system, and storage medium and program transmission apparatus therefor
Abstract
An information processing system, for processing information obtained from multiple sites that are connected via the Internet 10, includes: a webcrawler 13, for crawling sites, across the Internet 10, which are registered in a registered site DB 11; a metadata DB 12, for storing metadata from which information elements are extracted from content referred to by using a URL; an important information element extraction mechanism 30, for reading information stored in the metadata DB 12, and for extracting important information elements based on the matching level of information elements; an important information element DB 40, for storing the extracted important information elements; and a result display mechanism 41, for visually presenting said stored important information elements.
8 Claims
1. An information rearrangement method for rearranging information obtained from information sources connected via a network comprising:
- an information collection step of collecting information from a predetermined number of registered sites;
an information element extraction step of extracting, from among said collected information, information elements that include the same facts that are referred to at multiple sites, said information element extraction step comprising selecting, from keywords of information elements included in one set, a keyword having an appearance rate that is equal to or greater than a threshold value, said keywords comprising words that are searchable, rankable, extractable, representations of information elements for determining the same facts included in the information elements, and chosen from the group consisting of anchors, links, text, nouns, predetermined proper nouns, and predetermined verbs; and
a display step of displaying the contents of said extracted information elements while changing the display state of said contents in accordance with the number of sites whereat said facts are referred to;
said step of displaying comprising extracting a set of important information elements on a sentence level from a group composed of a predetermined number of sites, and folding the display for the same sets of important information elements.
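The extraction and display steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that the appearance rate is the share of a set's information elements containing a keyword, and the names `select_keywords`, `display_state`, and `render`, the state labels, and the site-count cutoffs are all hypothetical.

```python
from collections import Counter


def select_keywords(element_keywords, threshold):
    """Keep keywords whose appearance rate within one set meets the threshold.

    Assumption: appearance rate = share of the set's information elements
    that contain the keyword at least once.
    """
    if not element_keywords:
        return []
    counts = Counter()
    for keywords in element_keywords:
        counts.update(set(keywords))  # count each keyword once per element
    total = len(element_keywords)
    return sorted(k for k, c in counts.items() if c / total >= threshold)


def display_state(site_count):
    """Hypothetical mapping from the number of referring sites to a display state."""
    if site_count >= 5:
        return "large"
    if site_count >= 2:
        return "medium"
    return "small"


def render(sets):
    """Fold each set of same-fact sentences into a single display entry.

    `sets` maps a hypothetical set id to a list of (site, sentence) pairs.
    """
    rows = []
    for elements in sets.values():
        if not elements:
            continue
        sites = {site for site, _ in elements}
        sentences = [sentence for _, sentence in elements]
        rows.append({
            "headline": sentences[0],   # shown sentence
            "folded": sentences[1:],    # same-fact sentences hidden behind it
            "sites": len(sites),
            "state": display_state(len(sites)),
        })
    # Facts referred to at more sites are listed first.
    rows.sort(key=lambda r: r["sites"], reverse=True)
    return rows
```

In this sketch a fact reported by two or more sites is shown more prominently than one reported by a single site, and the remaining sentences of the same set are folded under the headline.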
3. An information rearrangement method for rearranging information obtained from information sources connected via a network comprising:
- an information collection step of collecting information from a predetermined number of registered sites;
an information element extraction step of extracting, from among said collected information, information elements that include the same facts that are referred to at multiple sites, said information element extraction step comprising selecting, from keywords of information elements included in one set, a keyword having an appearance rate that is equal to or greater than a threshold value, said keywords comprising words that are searchable, rankable, extractable, representations of information elements for determining the same facts included in the information elements, and chosen from the group consisting of anchors, links, text, nouns, predetermined proper nouns, and predetermined verbs; and
a topic keyword extraction step of extracting a topic keyword that represents the entire set of information elements to be extracted;
said topic keyword extraction step comprising a representative keyword extraction step, a set representative keyword extraction step, and a topic keyword collection step; and
a display step of displaying the contents of said extracted information elements, while displaying said extracted topic keyword at a position different from the contents concerning said information elements.
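The three sub-steps of the topic keyword extraction step in claim 3 are named but not defined in the claim itself. The Python sketch below is one possible reading, offered only as an illustration: it assumes that representative keywords are the most frequent keywords of each information element, that a set representative keyword is one that is representative in at least a threshold share of the set's elements, and that the collected topic keywords are shown on a header line above the contents. All function names and parameters are hypothetical.

```python
from collections import Counter


def representative_keywords(element_keywords, top_n=2):
    """Assumed rule: the top_n most frequent keywords of one information element."""
    counts = Counter(element_keywords)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [keyword for keyword, _ in ranked[:top_n]]


def set_representative_keywords(elements, threshold=0.6, top_n=2):
    """Assumed rule: keywords that are representative in >= threshold of the elements."""
    if not elements:
        return []
    counts = Counter()
    for element_keywords in elements:
        counts.update(representative_keywords(element_keywords, top_n))
    total = len(elements)
    return sorted(k for k, c in counts.items() if c / total >= threshold)


def collect_topic_keywords(sets, threshold=0.6, top_n=2):
    """Collect the topic keywords for every set, keyed by a hypothetical set id."""
    return {
        set_id: set_representative_keywords(elements, threshold, top_n)
        for set_id, elements in sets.items()
    }


def render_with_topic(topic_keywords, contents):
    """Show the topic keywords on their own line, apart from the contents."""
    header = "[" + ", ".join(topic_keywords) + "]"
    return "\n".join([header] + list(contents))
```

For example, `render_with_topic(["city", "quake"], ["Quake hits city"])` places the bracketed keywords on the first line and the sentence on the second.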
5. An information rearrangement method comprising the steps of:
- extracting information elements from multiple sites;
said step of extracting information elements comprising selecting, from keywords of information elements included in one set, a keyword having an appearance rate that is equal to or greater than a threshold value, said keywords comprising words that are searchable, rankable, extractable, representations of information elements for determining the same facts included in the information elements, and chosen from the group consisting of anchors, links, text, nouns, predetermined proper nouns, and predetermined verbs; and
determining whether, of said information elements extracted from said multiple sites, there are relevant information elements that convey the same facts as sentence-level information elements that constitute an arbitrary web page; and
when said relevant information elements that include the same facts as said sentence-level information elements are present in said information elements obtained from said multiple sites, adding remark information to said sentence-level information elements to provide information concerning said arbitrary web page.
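The determining and remark-adding steps of claim 5 can be sketched as follows. The claim does not say how "the same facts" are detected, so this Python sketch substitutes a simple stand-in: Jaccard overlap between keyword sets, compared against a hypothetical threshold. The function names, the threshold value, and the wording of the remark are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two keyword collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def annotate(page_sentences, collected, threshold=0.5):
    """Attach a remark to each page sentence that other sites also report.

    page_sentences: list of (sentence, keywords) from the arbitrary web page.
    collected: list of (site, keywords) extracted from the multiple sites.
    Returns a list of (sentence, remark) pairs; remark is None when no
    collected element is judged to convey the same fact.
    """
    annotated = []
    for sentence, keywords in page_sentences:
        sites = sorted({
            site
            for site, other_keywords in collected
            if jaccard(keywords, other_keywords) >= threshold
        })
        remark = None
        if sites:
            remark = f"also reported by {len(sites)} site(s): {', '.join(sites)}"
        annotated.append((sentence, remark))
    return annotated
```

A sentence with no sufficiently similar element among the collected sites is returned without a remark, which corresponds to the case where no relevant information elements are present.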
7. A storage medium on which a computer-readable program is stored, which permits a computer to perform:
- a process for collecting information from a predetermined number of registered sites;
a process for extracting, from among said collected information, information elements that include the same facts that are referred to at multiple sites, said information element extraction process comprising selecting, from keywords of information elements included in one set, a keyword having an appearance rate that is equal to or greater than a threshold value, said keywords comprising words that are searchable, rankable, extractable, representations of information elements for determining the same facts included in the information elements, and chosen from the group consisting of anchors, links, text, nouns, predetermined proper nouns, and predetermined verbs; and
a process for displaying the contents of said extracted information elements while changing the display state of said contents in accordance with the number of sites whereat said facts are referred to;
said process for displaying comprising extracting a set of important information elements on a sentence level from a group composed of a predetermined number of sites, and folding the display for the same sets of important information elements.
8. A storage medium on which a computer-readable program is stored, which permits a computer to perform:
- a process for collecting information from a predetermined number of registered sites;
a process for extracting, from among said collected information, information elements that include the same facts that are referred to at multiple sites, said information element extraction processing step comprising selecting, from keywords of information elements included in one set, a keyword having an appearance rate that is equal to or greater than a threshold value, said keywords comprising words that are searchable, rankable, extractable, representations of information elements for determining the same facts included in the information elements, and chosen from the group consisting of anchors, links, text, nouns, predetermined proper nouns, and predetermined verbs;
a process for extracting a topic keyword that represents the entire set of information elements to be extracted; and
a process for displaying the contents of said extracted information elements, while displaying said extracted topic keyword at a position different from the contents concerning said information elements.
Specification