Global anchor text processing
First Claim
1. A method for building a search index:
while building the search index and using the search index to respond to one or more search requests and performing synchronous anchor text processing,
maintaining an anchor information store, wherein each entry of the anchor information store identifies a referring document, a target document, and anchor text associated with a link from the referring document to the target document;
maintaining a rebuild agenda, wherein each entry of the rebuild agenda identifies a target document that has an entry in the search index and whose anchor text is to be updated in the search index with asynchronous processing because there is at least one new or updated link pointing to the target document;
receiving a document for processing;
for each outgoing link in the document that points to a target document,
adding an entry to the anchor information store that identifies the received document, the target document, and anchor text; and
adding an entry to the rebuild agenda for the target document; and
for each link pointing from a referring document to the document,
locating one or more entries in the anchor information store for which the received document to be processed is identified as the target document;
retrieving anchor text from each of the identified entries; and
storing the retrieved anchor text in an entry of the search index for the received document; and
performing asynchronous anchor text processing to incrementally update current entries in the search index for each document identified in each entry in the rebuild agenda in parallel with the building of the search index and in parallel with using the search index to respond to one or more search requests by:
selecting a first target document in the rebuild agenda;
using the anchor information store to find anchor text for the first target document by identifying one or more entries in the anchor information store for which the first target document is identified as the target document in the anchor information store;
retrieving anchor text from each of the identified entries; and
incrementally updating the anchor text in the entry of the search index for the first target document.
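The steps of the claim can be illustrated with a minimal single-process sketch. It is not taken from the patent: the names `AnchorEntry`, `AnchorIndexer`, `process_document`, and `run_rebuild` are hypothetical, the stores are plain in-memory containers, and the asynchronous pass is modeled as a function that is called separately rather than as a truly parallel task.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class AnchorEntry:
    """One entry of the anchor information store (hypothetical layout)."""
    referring: str  # document that contains the link
    target: str     # document the link points to
    text: str       # anchor text of the link


class AnchorIndexer:
    def __init__(self):
        self.anchor_store = defaultdict(set)  # target id -> {AnchorEntry}
        self.rebuild_agenda = deque()         # targets awaiting an anchor refresh
        self.search_index = {}                # doc id -> {"body", "anchors"}

    def _anchors_for(self, doc_id):
        # Locate entries whose target is doc_id and retrieve their anchor text.
        return sorted(e.text for e in self.anchor_store.get(doc_id, ()))

    def process_document(self, doc_id, body, links):
        """Synchronous path. `links` is a list of (target_id, anchor_text)."""
        for target, text in links:
            self.anchor_store[target].add(AnchorEntry(doc_id, target, text))
            # Only targets that already have an index entry need a rebuild;
            # unindexed targets pick up their anchors when they are processed.
            if target in self.search_index and target not in self.rebuild_agenda:
                self.rebuild_agenda.append(target)
        self.search_index[doc_id] = {
            "body": body,
            "anchors": self._anchors_for(doc_id),
        }

    def run_rebuild(self):
        """Asynchronous path, modeled here as a separately invoked drain."""
        while self.rebuild_agenda:
            target = self.rebuild_agenda.popleft()
            self.search_index[target]["anchors"] = self._anchors_for(target)


indexer = AnchorIndexer()
indexer.process_document("a", "page A", [("b", "see B")])
indexer.process_document("b", "page B", [("a", "back to A")])
print(indexer.search_index["b"]["anchors"])  # ['see B'], stored synchronously
print(indexer.search_index["a"]["anchors"])  # [], stale until the rebuild runs
indexer.run_rebuild()
print(indexer.search_index["a"]["anchors"])  # ['back to A']
```

Document "b" receives its anchor text synchronously because the link from "a" was already in the store when "b" arrived. Document "a" was indexed before the link from "b" existed, so it is placed on the rebuild agenda and refreshed later.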
Abstract
Provided are techniques for building a search index. While building the search index and using the search index to respond to one or more search requests, an anchor information store is maintained, wherein each entry of the anchor information store identifies a referring document, a target document, and anchor text associated with a link from the referring document to the target document; a document is received for processing; one or more entries in the anchor information store for which the document to be processed is identified as the target document are located; anchor text is retrieved from each of the identified entries; and the retrieved anchor text is stored in an entry of the search index for the document.
24 Claims
1. A method for building a search index: recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A computer program product comprising a computer readable storage medium storing a computer readable program, wherein the computer readable program when executed by a processor on a computer causes the computer to:
while building the search index and using the search index to respond to one or more search requests and performing synchronous anchor text processing,
maintain an anchor information store, wherein each entry of the anchor information store identifies a referring document, a target document, and anchor text associated with a link from the referring document to the target document;
maintain a rebuild agenda, wherein each entry of the rebuild agenda identifies a target document that has an entry in the search index and whose anchor text is to be updated in the search index with asynchronous processing because there is at least one new or updated link pointing to the target document;
receive a document for processing;
for each outgoing link in the document that points to a target document,
add an entry to the anchor information store that identifies the received document, the target document, and anchor text; and
add an entry to the rebuild agenda for the target document; and
for each link pointing from a referring document to the document,
locate one or more entries in the anchor information store for which the received document to be processed is identified as the target document;
retrieve anchor text from each of the identified entries; and
store the retrieved anchor text in an entry of the search index for the received document; and
perform asynchronous anchor text processing to incrementally update current entries in the search index for each document identified in each entry in the rebuild agenda in parallel with the building of the search index and in parallel with using the search index to respond to one or more search requests by:
selecting a first target document in the rebuild agenda;
using the anchor information store to find anchor text for the first target document by identifying one or more entries in the anchor information store for which the first target document is identified as the target document in the anchor information store;
retrieving anchor text from each of the identified entries; and
updating the anchor text in the entry of the search index for the first target document.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A system for building a search index, comprising:
hardware logic capable of performing operations, the operations comprising:
while building the search index and using the search index to respond to one or more search requests and performing synchronous anchor text processing,
maintaining an anchor information store, wherein each entry of the anchor information store identifies a referring document, a target document, and anchor text associated with a link from the referring document to the target document;
maintaining a rebuild agenda, wherein each entry of the rebuild agenda identifies a target document that has an entry in the search index and whose anchor text is to be updated in the search index with asynchronous processing because there is at least one new or updated link pointing to the target document;
receiving a document for processing;
for each outgoing link in the document that points to a target document,
adding an entry to the anchor information store that identifies the received document, the target document, and anchor text; and
adding an entry to the rebuild agenda for the target document; and
for each link pointing from a referring document to the document,
locating one or more entries in the anchor information store for which the received document to be processed is identified as the target document;
retrieving anchor text from each of the identified entries; and
storing the retrieved anchor text in an entry of the search index for the received document; and
performing asynchronous anchor text processing to incrementally update current entries in the search index for each document identified in each entry in the rebuild agenda in parallel with the building of the search index and in parallel with using the search index to respond to one or more search requests by:
selecting a first target document in the rebuild agenda;
using the anchor information store to find anchor text for the first target document by identifying one or more entries in the anchor information store for which the first target document is identified as the target document in the anchor information store;
retrieving anchor text from each of the identified entries; and
updating the anchor text in the entry of the search index for the first target document.
Dependent claims: 18, 19, 20, 21, 22, 23, 24
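The independent claims all require the asynchronous anchor text processing to run in parallel with index building and with search. One way this could look is sketched below, assuming a single background thread, a `queue.Queue` as the rebuild agenda, and one lock guarding the shared stores. The class `ConcurrentAnchorIndexer` and its methods are hypothetical names chosen for this illustration; the claims do not prescribe threads, locks, or any particular data layout.

```python
import queue
import threading
from collections import defaultdict


class ConcurrentAnchorIndexer:
    def __init__(self):
        self._lock = threading.Lock()          # guards the two dicts below
        self._anchor_store = defaultdict(set)  # target -> {(referring, text)}
        self._search_index = {}                # doc id -> sorted anchor texts
        self._agenda = queue.Queue()           # thread-safe rebuild agenda
        self._worker = threading.Thread(target=self._rebuild_loop, daemon=True)
        self._worker.start()

    def _anchors_for(self, doc_id):
        # Caller must hold self._lock.
        return sorted(text for _, text in self._anchor_store.get(doc_id, ()))

    def index_document(self, doc_id, links):
        """Synchronous path, run by the indexing thread."""
        with self._lock:
            for target, text in links:
                self._anchor_store[target].add((doc_id, text))
                if target in self._search_index:
                    self._agenda.put(target)  # refresh this target later
            self._search_index[doc_id] = self._anchors_for(doc_id)

    def search(self, term):
        """Return ids of documents whose stored anchor text contains `term`."""
        with self._lock:
            return sorted(doc for doc, anchors in self._search_index.items()
                          if any(term in a for a in anchors))

    def _rebuild_loop(self):
        """Asynchronous path, run by the background thread."""
        while True:
            target = self._agenda.get()
            try:
                if target is None:  # sentinel: stop the worker
                    return
                with self._lock:
                    self._search_index[target] = self._anchors_for(target)
            finally:
                self._agenda.task_done()

    def wait_until_idle(self):
        self._agenda.join()  # blocks until every queued target is rebuilt

    def close(self):
        self._agenda.put(None)
        self._worker.join()


indexer = ConcurrentAnchorIndexer()
indexer.index_document("home", [])
indexer.index_document("blog", [("home", "project home page")])
indexer.wait_until_idle()
print(indexer.search("project"))  # ['home']
indexer.close()
```

A search issued before `wait_until_idle()` returns may or may not see the new anchor text for "home", depending on whether the worker has processed the agenda entry yet. Searches stay available during the rebuild, but recently added anchor text can lag behind.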
Specification