Managing URLs
First Claim
1. A system comprising:
- a processor, coupled to a memory, configured to:
check an index of crawled items for a number of index entries in the index, wherein the index entries include references to the crawled items; and
when the number of index entries is equal to or greater than a target number, select one or more of the index entries for deletion from the index based on an importance of the crawled items referenced by the index entries.
Abstract
Crawling pages is disclosed. Pages are crawled up to a target number of pages. Additional pages, that have an importance that is equal to or greater than an importance threshold, are crawled beyond the target number of pages. In some embodiments, pages having an importance less than an importance threshold are deleted.
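The crawl policy summarized in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the breadth-first frontier, the function name `crawl`, and the `fetch_links` and `importance` callbacks are all assumptions made for illustration, and the toy link graph and scores are invented.

```python
from collections import deque
from typing import Callable, Iterable


def crawl(
    seeds: Iterable[str],
    fetch_links: Callable[[str], Iterable[str]],  # hypothetical: returns outlinks of a page
    importance: Callable[[str], float],           # hypothetical: importance score of a page
    target_number: int,
    importance_threshold: float,
) -> list[str]:
    """Crawl any page up to target_number; beyond that, only important pages."""
    frontier = deque(seeds)
    seen = set(frontier)
    crawled: list[str] = []
    while frontier:
        url = frontier.popleft()
        # Past the target, skip pages whose importance is below the threshold.
        if len(crawled) >= target_number and importance(url) < importance_threshold:
            continue
        crawled.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return crawled


# Toy link graph and importance scores (invented for the example).
demo_graph = {"a": ["b", "c"], "b": ["d"], "c": ["e"], "d": [], "e": []}
demo_scores = {"a": 0.9, "b": 0.2, "c": 0.8, "d": 0.1, "e": 0.7}
demo_result = crawl(
    ["a"],
    lambda u: demo_graph[u],
    lambda u: demo_scores[u],
    target_number=2,
    importance_threshold=0.5,
)
# demo_result == ['a', 'b', 'c', 'e']: 'b' is crawled before the target is
# reached despite its low score; 'd' is skipped afterwards.
```

In this sketch a page that is skipped is never expanded, so its outlinks are not discovered through it; a real crawler might choose differently.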
58 Claims
1. A system comprising:
a processor, coupled to a memory, configured to:
check an index of crawled items for a number of index entries in the index, wherein the index entries include references to the crawled items; and
when the number of index entries is equal to or greater than a target number, select one or more of the index entries for deletion from the index based on an importance of the crawled items referenced by the index entries.
16. A recordable storage medium having recorded and stored thereon instructions that, when executed, perform the actions of:
crawling items until a specified number of items is crawled; and
when the specified number of items is crawled, crawling additional items only if the additional items comprise a designated criterion.
20. A method of crawling items, the method comprising:
crawling, using a processor of one or more devices, items until a specified number of items is crawled; and
when the specified number of items is crawled, crawling, using a processor of the one or more devices, additional items only if the additional items comprise a designated criterion.
24. A system comprising:
a processor, coupled to a memory, configured to:
crawl items until a specified number of items is crawled; and
when the specified number of items is crawled, crawl additional items only if the additional items comprise a designated criterion.
28. A recordable storage medium having recorded and stored thereon instructions that, when executed, perform the actions of:
checking an index of crawled items for a number of index entries in the index, wherein the index entries include references to the crawled items; and
when the number of index entries is equal to or greater than a target number, selecting one or more of the index entries for deletion from the index based on an importance of the crawled items referenced by the index entries.
43. A method, comprising:
checking, using a processor of one or more devices, an index of crawled items for a number of index entries in the index, wherein the index entries include references to the crawled items; and
when the number of index entries is equal to or greater than a target number, selecting, using a processor of the one or more devices, one or more of the index entries for deletion from the index based on an importance of the crawled items referenced by the index entries.
58. A method comprising:
crawling items up to a target number of items, wherein at least a subset of the crawled items are not constrained to have an importance;
crawling additional items in the collection beyond the target number of items, wherein the additional items are constrained to have an importance that is equal to or greater than an importance threshold;
providing, in an index, for each of at least a subset of the crawled items and the additional crawled items crawl data associated with the respective item;
checking the index for a number of index entries in the index, wherein the index entries include references to the crawled items; and
selecting, when the number of index entries is equal to or greater than a target number, one or more of the index entries for deletion from the index based on an importance of the crawled items referenced by the index entries.
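The index-pruning step recited in the claims (check the entry count, then select entries for deletion by importance) might look like the following minimal sketch. The claims do not say how many entries to select or in what order, so the policy here is an assumption: pick the lowest-importance entries, just enough to bring the count below the target. `IndexEntry` and `select_for_deletion` are hypothetical names, and the sample data is invented.

```python
import heapq
from dataclasses import dataclass


@dataclass
class IndexEntry:
    url: str           # reference to the crawled item
    importance: float  # importance of the referenced item


def select_for_deletion(index: list[IndexEntry], target_number: int) -> list[IndexEntry]:
    """Return entries to delete once the index holds target_number or more entries."""
    if len(index) < target_number:
        return []  # below the target: nothing is selected
    # Assumed policy: remove just enough entries to drop below the target,
    # starting with the least important ones.
    excess = len(index) - target_number + 1
    return heapq.nsmallest(excess, index, key=lambda entry: entry.importance)


# Invented sample index with four entries and a target of three.
demo_index = [
    IndexEntry("a", 0.9),
    IndexEntry("b", 0.2),
    IndexEntry("c", 0.8),
    IndexEntry("d", 0.1),
]
demo_deleted = select_for_deletion(demo_index, target_number=3)
# The two least important entries, 'd' then 'b', are selected.
```

An alternative consistent with the abstract would be to select every entry whose importance falls below an importance threshold; which variant applies depends on the embodiment.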
Specification