Methods and systems for prioritizing a crawl
Abstract
Methods and systems for prioritizing a crawl are described. One aspect of the invention includes a method for identifying a plurality of storage locations each comprising a plurality of articles, ranking the plurality of storage locations based at least in part on events associated with the plurality of articles, and crawling the storage locations based at least in part on the ranking. Another aspect of the invention includes identifying a plurality of storage locations each comprising a plurality of articles, identifying a plurality of types of the plurality of articles, ranking the plurality of storage locations based at least in part on the plurality of types of the plurality of articles, and crawling the storage locations based at least in part on the ranking.
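The event-based aspect of the abstract can be illustrated with a minimal Python sketch. It is not taken from the specification: the function name `rank_locations_by_events`, the use of a simple event count as the ranking signal, and the representation of events as a list of article paths are all assumptions made for illustration.

```python
from collections import Counter
from pathlib import PurePosixPath


def rank_locations_by_events(locations, event_paths):
    """Order storage locations by how many user events touched their articles.

    `locations` is a list of directory paths; `event_paths` lists the path of
    the article involved in each observed user event (for example an open or
    a save). Counting events is an assumed scoring rule, not the patent's.
    """
    known = set(locations)
    counts = Counter()
    for article_path in event_paths:
        parent = str(PurePosixPath(article_path).parent)
        if parent in known:
            counts[parent] += 1
    # sorted() is stable, so locations with equal counts keep their input order.
    return sorted(locations, key=lambda loc: counts[loc], reverse=True)


locations = ["/home/u/docs", "/home/u/tmp", "/home/u/mail"]
events = ["/home/u/mail/a.eml", "/home/u/mail/b.eml", "/home/u/docs/r.txt"]
crawl_order = rank_locations_by_events(locations, events)
```

A crawler following this sketch would visit the locations in `crawl_order`, so that locations the user touches most often are indexed first.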
77 Claims
1. A method for crawling articles on a storage device local to a client device, comprising:
(a) identifying, by an application executing on the client device, a plurality of storage locations located on the client device, each storage location storing a plurality of articles;
(b) identifying, by the application executing on the client device, events performed by a user of the client device, wherein the events are associated with the plurality of articles;
(c) ranking, by the application executing on the client device, the plurality of storage locations based at least in part on the events; and
(d) crawling, by the application executing on the client device, the plurality of storage locations based at least in part on the ranking, the crawling comprising:
identifying, for one of the storage locations, a duplicate set of storage locations;
crawling the one of the storage locations and repressing crawls of the duplicate set of storage locations;
indexing the plurality of articles of at least the one of the storage locations, and
determining a time to re-crawl based on the plurality of articles of at least the one of the storage locations.
Dependent claims: 2-20
21. A method for crawling articles on a storage device local to a client device, comprising:
(a) identifying, by an application executing on the client device, a plurality of storage locations located on the client device, each storage location storing a plurality of articles;
(b) identifying, by the application executing on the client device, a plurality of types of the plurality of articles;
(c) ranking, by the application executing on the client device, the plurality of storage locations based at least in part on the plurality of types of the plurality of articles stored by each storage location; and
(d) crawling, by the application executing on the client device, the plurality of storage locations based at least in part on the ranking, the crawling comprising:
identifying, for one of the storage locations, a duplicate set of storage locations;
crawling the one of the storage locations and repressing crawls of the duplicate set of storage locations;
indexing the plurality of articles of at least the one of the storage locations, and
determining a time to re-crawl based on the plurality of articles of at least the one of the storage locations.
Dependent claims: 22-38
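Step (c) of claim 21 ranks locations by the types of articles they hold. A minimal Python sketch of one way to do this follows. The claim does not say how types are detected or weighted, so the file-extension lookup, the `TYPE_WEIGHTS` table, and the function name `rank_locations_by_type` are hypothetical choices made for illustration only.

```python
from pathlib import PurePosixPath

# Hypothetical weights: the claim does not state which article types count more.
TYPE_WEIGHTS = {".doc": 3.0, ".eml": 2.0, ".txt": 1.0}


def rank_locations_by_type(articles_by_location, type_weights=TYPE_WEIGHTS):
    """Order locations by the summed weight of the article types they store.

    `articles_by_location` maps a location path to a list of article file
    names. The article type is assumed to be the lower-cased file extension;
    unknown types contribute nothing to the score.
    """
    def score(location):
        return sum(
            type_weights.get(PurePosixPath(name).suffix.lower(), 0.0)
            for name in articles_by_location[location]
        )

    # Highest score first; ties keep the mapping's insertion order.
    return sorted(articles_by_location, key=score, reverse=True)


type_ranked = rank_locations_by_type(
    {"/a": ["x.txt", "y.txt"], "/b": ["m.DOC"], "/c": ["z.bin"]}
)
```

Summing weights favors locations with many high-value articles; an average would instead favor locations whose contents are uniformly of a preferred type. Either reading appears compatible with "based at least in part on" in the claim.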
39. A non-transitory computer-readable storage medium containing executable program code, comprising:
(a) program code for execution on a client device for identifying a plurality of storage locations located on the client device, each storage location storing a plurality of articles;
(b) program code for execution on the client device for identifying events performed by a user of the client device, wherein the events are associated with the plurality of articles;
(c) program code for execution on the client device for ranking the plurality of storage locations based at least in part on events; and
(d) program code for execution on the client device for crawling the plurality of storage locations based at least in part on the ranking, the crawling comprising:
identifying, for one of the storage locations, a duplicate set of storage locations;
crawling the one of the storage locations and repressing crawls of the duplicate set of storage locations;
indexing the plurality of articles of at least the one of the storage locations, and
determining a time to re-crawl based on the plurality of articles of at least the one of the storage locations.
Dependent claims: 40-58
59. A non-transitory computer-readable storage medium containing executable program code, comprising:
(a) program code for execution on a client device for identifying a plurality of storage locations located on the client device, each storage location storing a plurality of articles;
(b) program code for execution on the client device for identifying a plurality of types of the plurality of articles;
(c) program code for execution on the client device for ranking the plurality of storage locations based at least in part on the plurality of types of the plurality of articles; and
(d) program code for execution on the client device for crawling the plurality of storage locations based at least in part on the ranking, the crawling comprising:
identifying, for one of the storage locations, a duplicate set of storage locations;
crawling the one of the storage locations and repressing crawls of the duplicate set of storage locations;
indexing the plurality of articles of at least the one of the storage locations, and
determining a time to re-crawl based on the plurality of articles of at least the one of the storage locations.
Dependent claims: 60-76
77. A system for crawling local articles, comprising:
a computer processor for executing computer program instructions;
a computer-readable storage medium having executable computer program instructions tangibly embodied thereon, the executable computer program instructions comprising instructions for:
(a) identifying a plurality of storage locations located on the system, each storage location storing a plurality of articles;
(b) identifying a plurality of types of the plurality of articles;
(c) ranking the plurality of storage locations based at least in part on the plurality of types of the plurality of articles stored by each storage location and based at least in part on events performed by a user of the system wherein the events are associated with the plurality of articles; and
(d) crawling the plurality of storage locations based at least in part on the ranking, the crawling comprising:
identifying, for one of the storage locations, a duplicate set of storage locations;
crawling the one of the storage locations and repressing crawls of the duplicate set of storage locations;
indexing the plurality of articles of at least the one of the storage locations, and
determining a time to re-crawl based on the plurality of articles of at least the one of the storage locations.
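The crawling step (d) shared by the independent claims combines duplicate suppression, indexing, and scheduling of a re-crawl. The Python sketch below shows one possible reading under stated assumptions. The claims do not define how duplicates are detected or how the re-crawl time is computed, so the caller-supplied `canonical` function, the `list_articles` callback, and the rule "re-crawl sooner when the newest article was modified recently" are hypothetical.

```python
def crawl_in_ranked_order(ranked_locations, canonical, list_articles, now,
                          min_interval=3600.0, max_interval=7 * 86400.0):
    """Crawl ranked locations once each, skipping duplicates.

    canonical(location) returns a key shared by duplicate locations (for
    example a resolved real path); list_articles(location) returns a dict of
    article name to last-modified time in seconds. Both are assumed helpers.
    Returns (index, recrawl_at, suppressed).
    """
    index, recrawl_at, suppressed = {}, {}, []
    seen = set()
    for location in ranked_locations:
        key = canonical(location)
        if key in seen:
            # A duplicate of an already crawled location: suppress this crawl.
            suppressed.append(location)
            continue
        seen.add(key)
        articles = list_articles(location)
        index[location] = sorted(articles)  # stand-in for real indexing
        if articles:
            # Assumed heuristic: the interval equals the age of the newest
            # article, clamped to the range [min_interval, max_interval].
            age = now - max(articles.values())
            interval = min(max(age, min_interval), max_interval)
        else:
            interval = max_interval
        recrawl_at[location] = now + interval
    return index, recrawl_at, suppressed


same_target = {"/docs": "/docs", "/link_to_docs": "/docs", "/old": "/old"}
listing = {"/docs": {"a.txt": 990.0, "b.txt": 500.0}, "/old": {"c.txt": 0.0}}
index, recrawl_at, suppressed = crawl_in_ranked_order(
    ["/docs", "/link_to_docs", "/old"],
    canonical=same_target.get,
    list_articles=listing.__getitem__,
    now=1000.0, min_interval=60.0, max_interval=600.0,
)
```

Because the locations arrive already ranked, the highest-ranked member of each duplicate set is the one that gets crawled, and lower-ranked duplicates are recorded in `suppressed` rather than crawled again.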
Specification