Systems and methods for creating, navigating, and searching informational web neighborhoods

US 9,443,018 B2
Filed: 08/12/2014
Issued: 09/13/2016
Est. Priority Date: 01/19/2006
Status: Active Grant

Abstract

Systems and methods for the creation of hierarchical networks of overlapping informational web neighborhoods using percolation crawling. Each neighborhood comprises a set of closely linked pages that share a common set of concepts and intent and purpose. The neighborhoods represent web pages that share a common set of underlying concepts and semantic associations. Each such neighborhood can be semantically tagged.

22 Claims

1. A computer implemented method, the method comprising:
- performing probabilistic percolation crawling from one or more web pages, wherein the one or more web pages comprise one or more reference links, and wherein performing probabilistic percolation crawling comprises following the one or more reference links in and out of the one or more web pages to one or more neighboring nodes probabilistically, wherein performing percolation crawling further comprises randomly selecting reference links in and out of the web page and in and out of the one or more neighboring nodes, wherein selected reference out-links are added to a linked database when the link satisfies a first probability and selected reference in-links are added to the linked database when the link satisfies a second probability; and
  
  generating a structural web community neighborhood based on the percolation crawling from the at least one of the one or more web pages by iteratively partitioning the linked database into overlapping communities, the structured web community neighborhood comprising a plurality of communities of network nodes linked by edges around the one of the web pages, each of the plurality of communities comprising a set of network nodes that are more linked amongst themselves than to network nodes that are not included in the community.
- - 2. The method of claim 1, further comprising annotating each of the plurality of communities of network nodes in the structural web community with a concept.
  - 3. The method of claim 2 further comprising storing the annotated structural web neighborhood.
  - 4. The method of claim 1 further comprising performing semantic analysis of contents of the one or more neighboring nodes.
  - 5. The method of claim 4 further comprising determining relevance of the one or more neighboring nodes based at least in part on the semantic analysis.
  - 6. The method of claim 5 further comprising determining a type of relevance of the one or more neighboring nodes based at least in part on the semantic analysis.
  - 7. The method of claim 6 further comprising selectively discarding selected ones of the one or more neighboring nodes based on the relevance and type of relevance of the selected ones.
  - 8. The method of claim 1 wherein at least one of the links originates at one of the neighboring nodes.
  - 9. The method of claim 8 wherein the semantic analysis identifies a plurality of concepts in the one or more neighboring nodes.
  - 10. The method of claim 9 wherein the plurality of concepts comprises concepts identified from patterns of terms in the contents of the one or more neighboring nodes.
  - 11. The method of claim 9 wherein the plurality of concepts comprises predefined concepts associated with certain of the nodes.
  - 12. The method of claim 9, wherein relevance is determined by matching one or more of the plurality of concepts with a set of concepts associated with the structural web community neighborhood.
  - 13. The method of claim 1, further comprising:
    - performing a semantic analysis to determine a relevance of a neighboring node to at least one other neighboring node; and
      
      if the relevance of the network node exceeds a threshold, associating the neighboring node with the at least one other neighboring node to form one of the plurality of communities.
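
The crawling step of claim 1 can be illustrated with a minimal sketch. It assumes the link graph is already available as two in-memory dictionaries (`out_links` and `in_links`); the function name, the parameters `p_out` and `p_in` (standing in for the "first probability" and "second probability"), and the `max_nodes` cap are hypothetical and are not taken from the patent. A real crawler would fetch pages and discover links on the fly.

```python
import random
from collections import deque


def percolation_crawl(seeds, out_links, in_links, p_out, p_in,
                      max_nodes=1000, rng=None):
    """Illustrative probabilistic crawl (names and parameters are assumptions).

    Each out-link of a visited node is kept with probability p_out and each
    in-link with probability p_in. Kept links go into `edges`, which plays
    the role of the claim's "linked database".
    """
    rng = rng or random.Random()
    edges = set()                 # directed (source, target) pairs
    visited = set(seeds)
    frontier = deque(seeds)
    # max_nodes is a soft cap: it is only checked between node expansions.
    while frontier and len(visited) < max_nodes:
        node = frontier.popleft()
        candidates = [(node, dst, p_out) for dst in out_links.get(node, ())]
        candidates += [(src, node, p_in) for src in in_links.get(node, ())]
        rng.shuffle(candidates)   # consider links in random order
        for src, dst, prob in candidates:
            if rng.random() < prob:          # link "satisfies" its probability
                edges.add((src, dst))
                neighbor = dst if src == node else src
                if neighbor not in visited:
                    visited.add(neighbor)
                    frontier.append(neighbor)
    return visited, edges


# Tiny hypothetical graph: a -> b -> c, and d -> a.
out_links = {"a": ["b"], "b": ["c"], "d": ["a"]}
in_links = {"b": ["a"], "c": ["b"], "a": ["d"]}
nodes, db = percolation_crawl(["a"], out_links, in_links, p_out=1.0, p_in=1.0)
```

With both probabilities set to 1.0 the crawl degenerates into an exhaustive traversal of the connected neighborhood; lower values thin the crawl out, which is the point of choosing the two probabilities separately for out-links and in-links.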

14. An informational network neighborhood comprising:
- a memory configured to store a representation of a structured web community neighborhood within a network of linked nodes, the structured web community neighborhood comprising a plurality of communities of network nodes linked by edges around a web page, and wherein the web page and the set of network nodes comprise one or more reference links, each of the plurality of communities comprising a set of network nodes that are more linked amongst themselves than to network nodes that are not included in the community based on an analysis of the one or more reference links of the web page and the set of network nodes; and
  
  a processor configured to perform probabilistic percolation crawling to construct the structured web community neighborhood, wherein performing probabilistic percolation crawling comprises following the one or more reference links in and out of the one or more web pages to one or more network nodes probabilistically, wherein performing percolation crawling further comprises randomly selecting reference links in and out of the web page and in and out of the one or more neighboring nodes, wherein selected reference out-links are added to a linked database when the link satisfies a first probability and selected reference in-links are added to the linked database when the link satisfies a second probability.
- - 15. The informational network neighborhood of claim 14 wherein the processor is further configured to generate the structural web community based on the percolation crawling by iteratively partitioning the linked database into overlapping communities.
  - 16. The informational network neighborhood of claim 14 wherein the processor is further configured to assign a concept to each community based at least in part on a semantic analysis of the community.
  - 17. The informational network neighborhood of claim 16 wherein at least one concept is assigned using a text mining tool.
  - 18. The informational network neighborhood of claim 17 wherein the text mining tool is selected from the group consisting of a latent semantic indexing tool and a natural language processing tool.
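
Claims 1 and 14 characterize a community as a set of nodes that are "more linked amongst themselves" than to nodes outside the set. One possible reading, sketched below, compares the number of edges with both endpoints inside the set against the number of edges that cross its boundary. This interpretation and the helper names `link_counts` and `is_community` are assumptions for illustration; the claims do not fix a specific measure.

```python
def link_counts(community, edges):
    """Return (internal, external) edge counts for a candidate community.

    internal: edges with both endpoints inside the community.
    external: edges with exactly one endpoint inside the community.
    """
    members = set(community)
    internal = external = 0
    for src, dst in edges:
        inside = (src in members) + (dst in members)
        if inside == 2:
            internal += 1
        elif inside == 1:
            external += 1
    return internal, external


def is_community(community, edges):
    """Assumed criterion: strictly more internal than boundary edges."""
    internal, external = link_counts(community, edges)
    return internal > external


# Hypothetical linked database: a triangle a-b-c plus one edge out to d.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
tight = is_community({"a", "b", "c"}, edges)
loose = is_community({"c", "d"}, edges)
```

Because the test is applied to each candidate set independently, a node such as `c` can belong to more than one set that passes, which is compatible with the overlapping communities recited in the claims.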

19. A computer implemented method for conducting an informational search, the method comprising:
- identifying a community of linked network nodes associated with a set of concepts using probabilistic percolation crawling from a web page, wherein performing probabilistic percolation crawling comprises following the one or more reference links in and out of the one or more web pages to one or more neighboring nodes probabilistically, wherein performing percolation crawling further comprises randomly selecting reference links in and out of the web page and in and out of the one or more neighboring nodes, wherein selected reference out-links are added to the community when the link satisfies a first probability and selected reference in-links are added to the community when the link satisfies a second probability and wherein each node is related to the community by at least one of the set of concepts; and
  
  annotating the community with a community concept.
- - 20. The method of claim 19 wherein the community is maintained in a database with one or more other communities.
  - 21. The method of claim 20 wherein the database is configured to maintain a cumulative informational network neighborhood comprising the community and the one or more other communities.
  - 22. The method of claim 19, further comprising:
    - performing a semantic analysis to determine a relevance of a neighboring node to at least one other neighboring node; and
      
      if the relevance of the neighboring node exceeds a threshold, associating the neighboring node with the at least one other neighboring node to form the community.
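
The threshold step recited in claims 13 and 22 can be sketched as follows. The claims do not say how relevance is computed, so this example substitutes Jaccard similarity over per-node concept sets as a stand-in; the function names, the concept sets, and the threshold value are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two concept collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def associate(node_concepts, threshold):
    """Pair up neighboring nodes whose relevance exceeds the threshold.

    node_concepts maps a node name to the concepts found by some semantic
    analysis step (assumed to have run already). Returns the associated pairs.
    """
    pairs = []
    for n1, n2 in combinations(sorted(node_concepts), 2):
        if jaccard(node_concepts[n1], node_concepts[n2]) > threshold:
            pairs.append((n1, n2))
    return pairs


# Hypothetical concepts extracted from three neighboring nodes.
node_concepts = {
    "x": {"jazz", "music"},
    "y": {"jazz", "music", "blues"},
    "z": {"cars"},
}
pairs = associate(node_concepts, threshold=0.5)
```

Here `x` and `y` share two of three distinct concepts and are associated, while `z` shares none and stays unattached. Any other relevance measure could be dropped in for `jaccard` without changing the thresholding logic.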

Current Assignee: NetSeer, Inc. (Inuvo, Inc.)
Original Assignee: NetSeer, Inc. (Inuvo, Inc.)
Inventors: Rezaei, Behnam Attaran; Muntz, Alice Hwei-Yuan Meng
Primary Examiner(s): HICKS, MICHAEL J
Application Number: US14/457,693
Publication Number: US 20140351237A1
Time in Patent Office: 763 Days
Field of Search: None
US Class Current: 1/1
CPC Class Codes:
- G06F 16/282   Hierarchical databases, e.g...
- G06F 16/9024   Graphs; Linked lists G06F16...
- G06F 16/90344   by using string matching te...
- G06F 16/951   Indexing; Web crawling tech...
- G06F 16/958   Organisation or management ...
- G06N 7/00   Computing arrangements base...
