Method and apparatus to retrieve information from a network
Abstract
A method and apparatus to index network information are described. A network is searched for files of information relevant to people and resources in a particular field using a search list of weighted links to said files. The information is parsed into content and additional links to additional files. The content is weighted and copied to memory (such as a database). A determination is made as to whether the additional links are relevant to the people and resources in the given technical field. Those additional links that are relevant are weighted using a predetermined weighting algorithm. The relevant additional weighted links are copied to the search list. This process continues until an ending condition occurs.
28 Claims
1. A method to index network information, comprising:
(a) establishing a list of parameters to be searched over a network;
(b) establishing a search list of weighted links;
(c) assigning a predetermined initial weight to each parent link in said search list;
(d) searching said network for files of information containing one or more of said parameters using said search list of weighted links to said files;
(e) parsing said information into content and additional links to additional files;
(f) weighting said content;
(g) copying said weighted content into a memory;
(h) comparing each of said additional links to a locally-stored excluded domain file, wherein said excluded domain file contains a list of irrelevant links;
(i) identifying those of said additional links that are found relevant in accordance with said comparison using said excluded domain file;
(j) assigning a predetermined link weight to each said relevant additional link, wherein each said relevant additional link is initially assigned an identical link weight that is different from said initial weight of a corresponding parent link;
(k) adjusting said link weight of each said relevant additional link to be more than, less than or equal to said initial weight of said corresponding parent link depending on at least one of the following;
whether said relevant additional link has been previously processed, whether said relevant additional link has been previously unprocessed, and a number of said parameters present in said content corresponding to said relevant additional link;
(l) copying said relevant additional weighted links to said search list; and
(m) performing steps (d)-(l) until an ending condition occurs.
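Steps (a) through (m) of claim 1 can be read as a weighted crawl loop. The Python sketch below is only one possible reading under stated assumptions, not the patented implementation: the name `crawl`, the in-memory `network` dictionary standing in for real network access, the numeric weights, the highest-weight-first order, and the rule that a child's weight is a fixed base plus the number of matched parameters are all hypothetical. It also simply drops links that were already processed so that the loop terminates.

```python
from urllib.parse import urlparse

def crawl(seeds, parameters, excluded_domains, network,
          initial_weight=10, child_weight=5, max_files=100):
    """Index files reachable from `seeds`; returns {url: (content_weight, text)}.

    `network` is a hypothetical stand-in for the real network:
    a dict mapping url -> (text, list of outgoing links).
    """
    search_list = {url: initial_weight for url in seeds}      # steps (b), (c)
    memory = {}                                               # target of step (g)
    processed = set()
    while search_list and len(processed) < max_files:         # step (m): ending condition
        url = max(search_list, key=search_list.get)           # assumed: highest weight first
        del search_list[url]
        processed.add(url)
        text, links = network.get(url, ("", []))              # steps (d), (e)
        hits = sum(1 for p in parameters if p in text)        # step (f): assumed content weight
        if hits:
            memory[url] = (hits, text)                        # step (g)
        for link in links:
            if urlparse(link).hostname in excluded_domains:   # steps (h), (i)
                continue
            if link in processed:                             # assumed: never requeue
                continue
            weight = child_weight + hits                      # steps (j), (k): assumed rule
            # step (l): keep the larger weight if the link is already queued
            search_list[link] = max(weight, search_list.get(link, 0))
    return memory
```

With the default values, a child link starts at 5, below the parent's initial weight of 10, and can rise to equal or exceed it only when the parent's content matches five or more parameters.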
selecting a weighted link from said search list based on a predetermined ranking of each said weighted link within said search list;
retrieving a selected file of information corresponding to said selected weighted link; and
removing said selected weighted link from said search list.
9. The method of claim 8, wherein said retrieving said selected file comprises:
determining whether said searching is for a master build process;
retrieving said selected file of information from said network if said searching is for said master build process;
determining whether said selected file of information has been previously indexed into said memory if said searching is not for said master build process;
retrieving said selected file of information from said memory if said selected file of information has been previously indexed into said memory; and
retrieving said selected file of information from said network if said selected file of information has not been previously indexed into said memory.
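The branch order recited in claim 9 can be sketched as a small function. This is an illustration under assumptions: `retrieve_file`, the `fetch` callable, and the dictionary used as the memory are hypothetical names, and the second element of the returned tuple exists only to show which branch was taken.

```python
def retrieve_file(url, memory, fetch, master_build):
    """Return (content, source) for `url`, following the branch order of claim 9.

    `memory` is a dict of previously indexed files and `fetch` is any callable
    that retrieves a file from the network; both are assumptions of this sketch.
    """
    if master_build:
        return fetch(url), "network"   # a master build always goes to the network
    if url in memory:
        return memory[url], "memory"   # reuse the previously indexed copy
    return fetch(url), "network"       # not indexed yet, so fetch it
```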
10. The method of claim 8, wherein said selecting comprises:
ranking each weighted link from lowest weight to highest weight; and
selecting a weighted link with a lowest weight.
11. The method of claim 8, wherein said selecting comprises:
ranking each weighted link from highest weight to lowest weight; and
selecting a weighted link with a highest weight.
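Claims 10 and 11 differ only in the ranking direction, so both can be illustrated with one helper. The sketch below assumes the search list is a dictionary mapping each link to its weight; the name `select_link` and the `lowest_first` flag are hypothetical. It also removes the chosen link from the search list, as claim 8 recites, and raises `IndexError` if the list is empty.

```python
def select_link(search_list, lowest_first=False):
    """Rank the weighted links, then remove and return the top-ranked (url, weight)."""
    ranked = sorted(search_list.items(), key=lambda item: item[1],
                    reverse=not lowest_first)   # claim 11 by default, claim 10 if lowest_first
    url, weight = ranked[0]
    del search_list[url]                        # removal from the search list
    return url, weight
```

Sorting the whole list on every selection keeps the sketch close to the claim wording; a heap would be a natural substitute if the search list grows large.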
12. The method of claim 1, further comprising copying said additional links to a parent-child table.
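The claim does not say how the parent-child table is laid out. One possible layout, shown here purely as an assumption, is a two-column relation stored with Python's standard `sqlite3` module; the table name `parent_child`, its columns, and the helpers `record_links` and `children_of` are hypothetical.

```python
import sqlite3

def record_links(conn, parent, children):
    """Copy the links found in `parent` into an assumed parent_child table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS parent_child ("
        "parent TEXT NOT NULL, child TEXT NOT NULL, "
        "PRIMARY KEY (parent, child))"
    )
    # INSERT OR IGNORE skips pairs that are already recorded.
    conn.executemany(
        "INSERT OR IGNORE INTO parent_child VALUES (?, ?)",
        [(parent, child) for child in children],
    )
    conn.commit()

def children_of(conn, parent):
    """Return the recorded child links of `parent`, sorted alphabetically."""
    rows = conn.execute(
        "SELECT child FROM parent_child WHERE parent = ? ORDER BY child",
        (parent,),
    )
    return [row[0] for row in rows]
```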
13. The method of claim 1, wherein said parameters represent at least one of the following categories of information:
text;
images;
data;
meta tags;
program instructions; and
graphics.
14. A machine-readable medium whose contents cause a computer system to index network information by performing the following:
(a) establishing a list of parameters to be searched over a network;
(b) establishing a search list of weighted links;
(c) assigning a predetermined initial weight to each parent link in said search list;
(d) searching said network for files of information containing one or more of said parameters using said search list of weighted links to said files;
(e) parsing said information into content and additional links to additional files;
(f) weighting said content;
(g) copying said weighted content into a memory;
(h) comparing each of said additional links to a locally-stored excluded domain file, wherein said excluded domain file contains a list of irrelevant links;
(i) identifying those of said additional links that are found relevant in accordance with said comparison using said excluded domain file;
(j) assigning a predetermined link weight to each said relevant additional link, wherein each said relevant additional link is initially assigned an identical link weight that is different from said initial weight of a corresponding parent link;
(k) adjusting said link weight of each said relevant additional link to be more than, less than or equal to said initial weight of said corresponding parent link depending on at least one of the following;
whether said relevant additional link has been previously processed, whether said relevant additional link has been previously unprocessed, and a number of said parameters present in said content corresponding to said relevant additional link;
(l) copying said relevant additional weighted links to said search list; and
(m) performing steps (d)-(l) until an ending condition occurs.
selecting a weighted link from said search list based on a predetermined ranking of each said weighted link within said search list;
retrieving a selected file of information corresponding to said selected weighted link; and
removing said selected weighted link from said search list.
22. The machine-readable medium of claim 21, wherein said retrieving said selected file comprises:
determining whether said searching is for a master build process;
retrieving said selected file of information from said network if said searching is for said master build process;
determining whether said selected file of information has been previously indexed into said memory if said searching is not for said master build process;
retrieving said selected file of information from said memory if said selected file of information has been previously indexed into said memory; and
retrieving said selected file of information from said network if said selected file of information has not been previously indexed into said memory.
23. The machine-readable medium of claim 21, wherein said selecting comprises:
ranking each weighted link from lowest weight to highest weight; and
selecting a weighted link with a lowest weight.
24. The machine-readable medium of claim 21, wherein said selecting comprises:
ranking each weighted link from highest weight to lowest weight; and
selecting a weighted link with a highest weight.
25. The machine-readable medium of claim 14, further comprising copying said additional links to a parent-child table.
26. The machine-readable medium of claim 14, wherein said parameters represent at least one of the following categories of information:
text;
images;
data;
meta tags;
program instructions; and
graphics.
27. An apparatus to index network information, comprising:
(a) means for establishing a list of parameters to be searched over a network;
(b) means for establishing a search list of weighted links;
(c) means for assigning a predetermined initial weight to each parent link in said search list;
(d) means for searching said network for files of information containing one or more of said parameters using said search list of weighted links to said files;
(e) means for parsing said information into content and additional links to additional files;
(f) means for weighting said content;
(g) means for copying said weighted content into a memory;
(h) means for comparing each of said additional links to a locally-stored excluded domain file, wherein said excluded domain file contains a list of irrelevant links;
(i) means for identifying those of said additional links that are found relevant in accordance with said comparison using said excluded domain file;
(j) means for assigning a predetermined link weight to each said relevant additional link, wherein each said relevant additional link is initially assigned an identical link weight that is different from said initial weight of a corresponding parent link;
(k) means for adjusting said link weight of each said relevant additional link to be more than, less than or equal to said initial weight of said corresponding parent link depending on at least one of the following;
whether said relevant additional link has been previously processed, whether said relevant additional link has been previously unprocessed, and a number of said parameters present in said content corresponding to said relevant additional link;
(l) means for copying said relevant additional weighted links to said search list; and
(m) means for performing functions in (d)-(l) until an ending condition occurs.
Specification