Systems and methods for creating, navigating, and searching informational web neighborhoods
Abstract
Systems and methods for the creation of hierarchical networks of overlapping informational web neighborhoods using percolation crawling. Each neighborhood comprises a set of closely linked pages that share a common set of concepts, intent, and purpose. The neighborhoods represent web pages that share a common set of underlying concepts and semantic associations. Each such neighborhood can be semantically tagged.
22 Claims
1. A computer implemented method, the method comprising:
performing probabilistic percolation crawling from one or more web pages, wherein the one or more web pages comprise one or more reference links, and wherein performing probabilistic percolation crawling comprises following the one or more reference links in and out of the one or more web pages to one or more neighboring nodes probabilistically, wherein performing percolation crawling further comprises randomly selecting reference links in and out of the web page and in and out of the one or more neighboring nodes, wherein selected reference out-links are added to a linked database when the link satisfies a first probability and selected reference in-links are added to the linked database when the link satisfies a second probability; and
generating a structural web community neighborhood based on the percolation crawling from the at least one of the one or more web pages by iteratively partitioning the linked database into overlapping communities, the structured web community neighborhood comprising a plurality of communities of network nodes linked by edges around the one of the web pages, each of the plurality of communities comprising a set of network nodes that are more linked amongst themselves than to network nodes that are not included in the community.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
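The crawling step recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that "satisfies a first probability" means an independent random draw falls below a threshold, and it crawls an in-memory link graph rather than fetching pages. The names `percolation_crawl`, `out_links`, `in_links`, `p_out`, `p_in`, and `max_nodes` are hypothetical and do not appear in the claims.

```python
import random
from collections import deque


def percolation_crawl(out_links, in_links, seeds, p_out, p_in,
                      max_nodes=1000, rng=None):
    """Collect a set of directed edges (source, target) around the seed pages.

    out_links / in_links: dicts mapping a page to the pages it links to /
    the pages that link to it (a stand-in for real fetching, assumed here).
    p_out / p_in: the "first" and "second" probabilities, interpreted as
    Bernoulli acceptance thresholds (an assumption, not claim language).
    """
    rng = rng or random.Random()
    linked_db = set()          # stands in for the claimed "linked database"
    visited = set(seeds)
    frontier = deque(seeds)

    while frontier and len(visited) < max_nodes:
        page = frontier.popleft()

        # Follow out-links; keep each with probability p_out.
        for target in out_links.get(page, ()):
            if rng.random() < p_out:
                linked_db.add((page, target))
                if target not in visited:
                    visited.add(target)
                    frontier.append(target)

        # Follow in-links; keep each with probability p_in.
        for source in in_links.get(page, ()):
            if rng.random() < p_in:
                linked_db.add((source, page))
                if source not in visited:
                    visited.add(source)
                    frontier.append(source)

    return linked_db
```

Under these assumptions, setting both probabilities to 1 collects every edge reachable from the seeds, while lower values thin the crawl so that it tends to stay within densely linked regions.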
14. An informational network neighborhood comprising:
a memory configured to store a representation of a structured web community neighborhood within a network of linked nodes, the structured web community neighborhood comprising a plurality of communities of network nodes linked by edges around a web page, and wherein the web page and the set of network nodes comprise one or more reference links, each of the plurality of communities comprising a set of network nodes that are more linked amongst themselves than to network nodes that are not included in the community based on an analysis of the one or more reference links of the web page and the set of network nodes; and
a processor configured to perform probabilistic percolation crawling to construct the structured web community neighborhood, wherein performing probabilistic percolation crawling comprises following the one or more reference links in and out of the one or more web pages to one or more network nodes probabilistically, wherein performing percolation crawling further comprises randomly selecting reference links in and out of the web page and in and out of the one or more neighboring nodes, wherein selected reference out-links are added to a linked database when the link satisfies a first probability and selected reference in-links are added to the linked database when the link satisfies a second probability.
Dependent claims: 15, 16, 17, 18
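The community property recited here, nodes "more linked amongst themselves than to network nodes that are not included in the community", can be checked with a short sketch. It assumes one possible reading: a community qualifies when the number of edges with both endpoints inside it exceeds the number of edges with exactly one endpoint inside it. The function names `link_counts` and `is_cohesive` are hypothetical, and the claims do not specify this particular test.

```python
def link_counts(edges, community):
    """Return (internal, boundary) edge counts for a candidate community.

    edges: iterable of (source, target) pairs from the linked database.
    community: iterable of node identifiers.
    """
    members = set(community)
    internal = 0   # both endpoints inside the community
    boundary = 0   # exactly one endpoint inside the community
    for source, target in edges:
        source_in = source in members
        target_in = target in members
        if source_in and target_in:
            internal += 1
        elif source_in or target_in:
            boundary += 1
    return internal, boundary


def is_cohesive(edges, community):
    """Assumed criterion: more internal links than links crossing the boundary."""
    internal, boundary = link_counts(edges, community)
    return internal > boundary
```

Because the test is applied to each community independently, a node may belong to several communities that each pass it, which is consistent with the overlapping communities described in the claims.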
19. A computer implemented method for conducting an informational search, the method comprising:
identifying a community of linked network nodes associated with a set of concepts using probabilistic percolation crawling from a web page, wherein performing probabilistic percolation crawling comprises following the one or more reference links in and out of the one or more web pages to one or more neighboring nodes probabilistically, wherein performing percolation crawling further comprises randomly selecting reference links in and out of the web page and in and out of the one or more neighboring nodes, wherein selected reference out-links are added to the community when the link satisfies a first probability and selected reference in-links are added to the community when the link satisfies a second probability and wherein each node is related to the community by at least one of the set of concepts; and
annotating the community with a community concept.
Dependent claims: 20, 21, 22
Specification