Auto generation of suggested links in a search system
First Claim
1. A method of automatically generating suggested links in a search system, the method comprising:
- initiating a first crawl across an enterprise corpus owned by an enterprise;
- discovering during the first crawl a link pointing to a data source, the data source being mis-characterized during the first crawl as outside a boundary of the enterprise corpus owned by the enterprise;
- automatically storing the link as a first suggested link with other suggested links in a memory;
- initiating a second crawl across the enterprise corpus after the automatically storing, the second crawl having a different seed uniform resource locator (URL) or different boundary rules than the first crawl;
- encountering during the second crawl the data source actually within the same boundary of the enterprise corpus;
- removing, using a processor operatively coupled to the memory, the first suggested link from the other suggested links based on encountering the data source, previously characterized as outside the boundary of the enterprise corpus, within the same boundary of the enterprise corpus during the second crawl; and
- determining relevancy scoring for the other suggested links.
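The steps of the first claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the in-memory `link_graph`, the host-based `in_boundary` rule, the `example` host names, and the inbound-link `score` function are all assumptions made for illustration, since the claim does not say how boundaries are defined or how relevancy is scored.

```python
from urllib.parse import urlparse


def in_boundary(url, allowed_hosts):
    """Hypothetical boundary rule: a URL is inside the corpus if its host is allowed."""
    return urlparse(url).hostname in allowed_hosts


def crawl(seed, link_graph, allowed_hosts):
    """Breadth-first crawl over an in-memory link graph (a stand-in for real fetching).

    Returns (visited, outside): URLs crawled inside the boundary, and URLs
    that were linked to but fell outside the boundary.
    """
    visited, outside, queue = set(), set(), [seed]
    while queue:
        url = queue.pop(0)
        if url in visited:
            continue
        if not in_boundary(url, allowed_hosts):
            outside.add(url)  # candidate suggested link
            continue
        visited.add(url)
        queue.extend(link_graph.get(url, []))
    return visited, outside


# Assumed toy corpus: the home page links to a wiki page and a partner page.
link_graph = {
    "http://intranet.example/home": [
        "http://wiki.example/start",
        "http://partner.example/docs",
    ],
    "http://wiki.example/start": [],
}
seed = "http://intranet.example/home"
suggested = set()

# First crawl: narrow boundary rules, so the wiki is mis-characterized as outside.
_, outside = crawl(seed, link_graph, {"intranet.example"})
suggested |= outside  # automatically store the discovered links

# Second crawl: different boundary rules that now cover the wiki.
visited, _ = crawl(seed, link_graph, {"intranet.example", "wiki.example"})
suggested -= visited  # remove links to sources encountered inside the boundary


def score(url):
    """Placeholder relevancy score: number of pages linking to the URL."""
    return sum(url in targets for targets in link_graph.values())


scores = {url: score(url) for url in suggested}
```

In this toy run the wiki link is stored after the first crawl and dropped after the second, leaving only the partner link to be scored.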
Abstract
A flexible and extensible architecture allows for secure searching across an enterprise. Such an architecture can provide a simple Internet-like search experience to users searching secure content inside (and outside) the enterprise. The architecture allows for the crawling and searching of a variety of sources across an enterprise, regardless of whether any of these sources conform to a conventional user role model. The architecture further allows for security attributes to be submitted at query time, for example, in order to provide real-time secure access to enterprise resources. The user query also can be transformed to provide for dynamic querying that provides for a more current result list than can be obtained for static queries.
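One way to picture security attributes being submitted at query time is the sketch below. It is only an assumption-laden illustration, not the architecture described in the specification: the `Document` type, its `allowed_groups` field, and the `secure_search` function are hypothetical names, and simple substring matching stands in for a real search index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    title: str
    text: str
    allowed_groups: frozenset  # hypothetical per-document access list


def secure_search(index, query, user_groups):
    """Return titles that match the query and that the caller may see.

    user_groups is passed with each query rather than stored at index time,
    so a change in group membership takes effect on the next query.
    """
    terms = query.lower().split()
    return [
        doc.title
        for doc in index
        if all(term in doc.text.lower() for term in terms)
        and doc.allowed_groups & set(user_groups)
    ]


index = [
    Document("Payroll policy", "payroll policy for staff", frozenset({"hr"})),
    Document("Travel policy", "travel policy for staff", frozenset({"hr", "eng"})),
]

eng_results = secure_search(index, "policy", ["eng"])
hr_results = secure_search(index, "policy", ["hr"])
```

Here a user in the assumed `eng` group sees only the travel policy, while a user in `hr` sees both documents for the same query.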
221 Citations
20 Claims
1. A method of automatically generating suggested links in a search system, the method comprising:
- initiating a first crawl across an enterprise corpus owned by an enterprise;
- discovering during the first crawl a link pointing to a data source, the data source being mis-characterized during the first crawl as outside a boundary of the enterprise corpus owned by the enterprise;
- automatically storing the link as a first suggested link with other suggested links in a memory;
- initiating a second crawl across the enterprise corpus after the automatically storing, the second crawl having a different seed uniform resource locator (URL) or different boundary rules than the first crawl;
- encountering during the second crawl the data source actually within the same boundary of the enterprise corpus;
- removing, using a processor operatively coupled to the memory, the first suggested link from the other suggested links based on encountering the data source, previously characterized as outside the boundary of the enterprise corpus, within the same boundary of the enterprise corpus during the second crawl; and
- determining relevancy scoring for the other suggested links.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A computer program product embedded in a computer readable storage medium for automatically generating suggested links in a search system, comprising:
- program code for initiating a first crawl across an enterprise corpus owned by an enterprise;
- program code for discovering during the first crawl a link pointing to a data source, the data source being mis-characterized during the first crawl as outside a boundary of the enterprise corpus owned by the enterprise;
- program code for automatically storing the link as a first suggested link with other suggested links in a memory;
- program code for initiating a second crawl across the enterprise corpus after the automatically storing, the second crawl having a different seed uniform resource locator (URL) or different boundary rules than the first crawl;
- program code for encountering during the second crawl the data source within the same boundary of the enterprise corpus;
- program code for removing the first suggested link from the other suggested links based on the encountering the data source, previously characterized as outside the boundary of the enterprise corpus, within the same boundary of the enterprise corpus during the second crawl; and
- program code for determining relevancy scoring for the other suggested links.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
20. A computer system for automatically generating suggested links in a search system, comprising:
- at least one or more processors; and
- a memory operatively coupled with the one or more processors, the at least one or more processors executing instructions set forth in a computer program for:
  - initiating a first crawl across an enterprise corpus owned by an enterprise;
  - discovering during the first crawl a link pointing to a data source, the data source being mis-characterized during the first crawl as outside a boundary of the enterprise corpus owned by the enterprise;
  - automatically storing the link as a first suggested link with other suggested links in a memory;
  - initiating a second crawl across the enterprise corpus after the automatically storing, the second crawl having a different seed uniform resource locator (URL) or different boundary rules than the first crawl;
  - encountering during the second crawl the data source actually within the same boundary of the enterprise corpus;
  - removing the first suggested link from the other suggested links based on encountering the data source, previously characterized as outside the boundary of the enterprise corpus, within the same boundary of the enterprise corpus during the second crawl; and
  - determining relevancy scoring for the other suggested links.
Specification