Spatially directed crawling of documents
Abstract
An interface program stored on a computer-readable medium for causing a computer system with a display device to perform the functions of: accepting search criteria from a user including a free text entry query and a domain identifier identifying a domain; in response to accepting the search criteria, retrieving a plurality of record identifiers each of which identifies a corresponding record which: (1) has associated therewith a location identifier that locates it at a specific location within the domain identified by the domain identifier; and (2) contains information that is responsive to the free text entry query; displaying a representation of the domain on the display device; and displaying on the display device a plurality of icons as representations of the records identified by the plurality of record identifiers, wherein for each of the record identifiers, a corresponding one of the plurality of icons is displayed within the representation of the domain that is being displayed on the display device, the corresponding icon for each of the plurality of record identifiers being positioned within the representation of the domain at a coordinate within the domain that corresponds to the location identifier for the corresponding record.
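The retrieval and display functions recited in the abstract can be illustrated with a minimal sketch. It assumes a rectangular domain given as a bounding box, records held in memory, and a simple "every query term appears in the text" match; the names `Domain`, `Record`, `search`, and `icon_positions` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """Hypothetical rectangular domain described by a bounding box."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def contains(self, x, y):
        return self.min_x <= x <= self.max_x and self.min_y <= y <= self.max_y


@dataclass(frozen=True)
class Record:
    """A record with text content and a location identifier (x, y)."""
    record_id: str
    text: str
    x: float
    y: float


def search(records, query, domain):
    """Return identifiers of records that lie inside the domain and
    whose text contains every term of the free text query."""
    terms = query.lower().split()
    return [
        r.record_id
        for r in records
        if domain.contains(r.x, r.y) and all(t in r.text.lower() for t in terms)
    ]


def icon_positions(records, record_ids, domain, width, height):
    """Map each identified record to a pixel position inside a
    width x height representation of the domain (y axis pointing down).
    Assumes the domain has non-zero width and height."""
    by_id = {r.record_id: r for r in records}
    positions = {}
    for rid in record_ids:
        r = by_id[rid]
        px = (r.x - domain.min_x) / (domain.max_x - domain.min_x) * width
        py = (domain.max_y - r.y) / (domain.max_y - domain.min_y) * height
        positions[rid] = (round(px), round(py))
    return positions
```

In this sketch the icon coordinate is a linear rescaling of the record's location into the displayed view; an actual map display would more likely use a cartographic projection.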
46 Claims
1-32. (canceled)
33. A method for populating a spatial document database with hyperlinked documents containing spatial information, the method comprising:
providing a destination database containing potential sources of gatherable documents;
providing a history database of known sources where documents have been gathered;
providing a crawler computer process which can follow a hyperlink in a document to access a potential source of gatherable documents specified by the hyperlink;
bootstrapping the crawler;
iterating the crawler over the destination database, including the steps of:
moving a potential source of gatherable documents from the destination database to the history database;
inspecting the potential source for gatherable documents;
storing any such gatherable documents in the spatial document database; and
adding to the destination database all potential sources of gatherable documents which are referenced by a hyperlink in the gatherable documents.
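The steps of claim 33 can be sketched as a simple loop over in-memory stand-ins for the three databases. This is only an illustration under stated assumptions: `fetch`, `is_gatherable`, and `extract_links` are hypothetical caller-supplied functions, each source is assumed to yield at most one document, and skipping sources that are already known is an added safeguard against revisiting that the claim does not recite.

```python
from collections import deque


def crawl(seed_sources, fetch, is_gatherable, extract_links, max_sources=1000):
    """Populate a spatial document store by following hyperlinks.

    fetch(source)         -> document, or None if nothing is found (hypothetical)
    is_gatherable(doc)    -> True if the document contains spatial information
    extract_links(doc)    -> iterable of sources referenced by hyperlinks
    """
    destination = deque(seed_sources)  # bootstrapped "destination database"
    queued = set(seed_sources)         # sources currently awaiting a visit
    history = set()                    # "history database" of known sources
    spatial_db = {}                    # "spatial document database"

    while destination and len(history) < max_sources:
        # Move one potential source from the destination to the history.
        source = destination.popleft()
        queued.discard(source)
        history.add(source)

        # Inspect the source for a gatherable document.
        document = fetch(source)
        if document is None or not is_gatherable(document):
            continue

        # Store the gatherable document.
        spatial_db[source] = document

        # Add every source hyperlinked from the gatherable document.
        # Assumption: already known or already queued sources are skipped.
        for link in extract_links(document):
            if link not in history and link not in queued:
                destination.append(link)
                queued.add(link)

    return spatial_db, history
```

Because only gatherable documents contribute new links, the crawl in this sketch stays close to sources that carry spatial information rather than expanding over every reachable page.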
34-36. (canceled)
37. A method for populating a document repository, said method using a page queue for storing document addresses, said method comprising:
retrieving a document address from the page queue;
loading into the document repository a document that is identified by the retrieved document address;
parsing the loaded document for links to new documents;
storing addresses of the new documents into the page queue along with a spatial relevance level for each stored address;
iteratively repeating the steps of retrieving, loading, parsing and storing to populate the document repository, and wherein retrieving involves using the spatial relevance levels of the stored addresses in the page queue to determine which document addresses are retrieved from the page queue.
Dependent claims: 38, 39, 40, 41, 42, 43, 44, 45, 46
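One way to picture the page queue of claim 37 is as a priority queue keyed on spatial relevance. The sketch below is an assumption-laden illustration, not the patented implementation: `PageQueue`, `populate`, `load`, `parse_links`, and `relevance_of` are hypothetical names, a higher number is assumed to mean more spatially relevant, and how the relevance level is computed is left to the caller because the claim does not specify it.

```python
import heapq
import itertools


class PageQueue:
    """Queue of document addresses ordered by spatial relevance level."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: first stored, first out
        self._seen = set()               # assumption: each address stored once

    def store(self, address, relevance):
        if address in self._seen:
            return
        self._seen.add(address)
        # heapq is a min-heap, so negate relevance to pop the highest first.
        heapq.heappush(self._heap, (-relevance, next(self._order), address))

    def retrieve(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


def populate(queue, load, parse_links, relevance_of, max_documents=100):
    """Repeat retrieve, load, parse, store until the queue is empty
    or max_documents documents have been loaded."""
    repository = {}
    while len(queue) and len(repository) < max_documents:
        address = queue.retrieve()                 # retrieving
        document = load(address)                   # loading
        if document is None:
            continue
        repository[address] = document
        for link in parse_links(document):         # parsing
            # storing, along with a spatial relevance level
            queue.store(link, relevance_of(document, link))
    return repository
```

With this ordering, addresses judged more spatially relevant are loaded before less relevant ones, so a crawl that is cut off early will have spent its budget on the more promising pages.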
Specification