Plug-in parsers for configuring search engine crawler
Abstract
A plug-in interface is provided in a crawling search engine. Plug-in parsers are also provided for use with the search engine. The plug-in interface allows the search engine to be configured with different plug-in parsers. Thus, a customer may configure the search engine with the parser that best suits the customer's needs and may try new parsing algorithms to find the best results.
21 Claims
1. A method for configuring a search engine, comprising:
providing a plug-in interface for a crawling search engine; and
providing at least one plug-in parser for the crawling search engine, wherein the at least one plug-in parser parses page data for window chunks and wherein the crawling search engine loads one of the at least one plug-in parser via the plug-in interface.
Dependent claims: 2, 3, 4, 5
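The arrangement in claim 1 can be illustrated with a minimal Python sketch. Nothing below comes from the patent's specification: the names `ParserPlugin`, `WindowChunk`, `FixedWindowParser`, `ParagraphParser`, and `CrawlingSearchEngine` are hypothetical, and a "window chunk" is assumed here to be simply a contiguous slice of page text. The point is only to show an engine that loads one of several interchangeable parsers through a single interface.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
import re


@dataclass
class WindowChunk:
    """Hypothetical container: a slice of page text plus any links found in it."""
    text: str
    anchors: list = field(default_factory=list)


class ParserPlugin(ABC):
    """Hypothetical plug-in interface that every parser must implement."""

    @abstractmethod
    def parse(self, page_data: str) -> list:
        """Return the window chunks found in page_data."""


class FixedWindowParser(ParserPlugin):
    """Example plug-in: cuts the page into fixed-size windows of words."""

    def __init__(self, window_size: int = 5):
        self.window_size = window_size

    def parse(self, page_data: str) -> list:
        words = page_data.split()
        return [
            WindowChunk(" ".join(words[i:i + self.window_size]))
            for i in range(0, len(words), self.window_size)
        ]


class ParagraphParser(ParserPlugin):
    """Alternative plug-in: one chunk per blank-line-separated paragraph."""

    def parse(self, page_data: str) -> list:
        parts = re.split(r"\n\s*\n", page_data)
        return [WindowChunk(p.strip()) for p in parts if p.strip()]


class CrawlingSearchEngine:
    """Engine stub that only knows parsers through the plug-in interface."""

    def __init__(self):
        self._registry = {}
        self.parser = None

    def register_parser(self, name: str, plugin_cls: type) -> None:
        # Reject anything that does not implement the interface.
        if not issubclass(plugin_cls, ParserPlugin):
            raise TypeError(f"{plugin_cls!r} does not implement ParserPlugin")
        self._registry[name] = plugin_cls

    def load_parser(self, name: str, **options) -> ParserPlugin:
        # Loading is a lookup by configured name plus instantiation.
        self.parser = self._registry[name](**options)
        return self.parser


engine = CrawlingSearchEngine()
engine.register_parser("fixed", FixedWindowParser)
engine.register_parser("paragraph", ParagraphParser)
engine.load_parser("fixed", window_size=3)
chunks = engine.parser.parse("one two three four five six seven")
```

Switching to a different parsing algorithm in this sketch is a one-line change, for example `engine.load_parser("paragraph")`, which is the kind of reconfiguration the abstract describes.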
6. A method for searching documents, comprising:
loading a plug-in parser;
obtaining a list of pages;
obtaining page data for the pages;
using the plug-in parser to parse the page data for a set of window chunks; and
recursively crawling anchors in the set of window chunks.
Dependent claims: 7, 8, 9
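The five steps of claim 6 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the `PAGES` dictionary stands in for real page retrieval, `line_parser` is a hypothetical plug-in that treats each line of page data as one window chunk, and `load_parser`, `fetch`, `crawl`, and `search_documents` are names invented for this example. A visited set is added so that cyclic links terminate; the claim itself does not mention one.

```python
from html.parser import HTMLParser

# Hypothetical in-memory "web" so the sketch runs without network access.
PAGES = {
    "a.html": '<p>Start page</p>\n<a href="b.html">B</a> <a href="c.html">C</a>',
    "b.html": '<p>Page B links back</p>\n<a href="a.html">A</a>',
    "c.html": '<p>Page C</p>\n<a href="d.html">D</a>',  # d.html does not exist
}


class _AnchorCollector(HTMLParser):
    """Collects visible text and href targets from one HTML fragment."""

    def __init__(self):
        super().__init__()
        self.text = []
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.anchors.append(href)

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


def line_parser(page_data):
    """Hypothetical plug-in: one window chunk per line of page data."""
    chunks = []
    for line in page_data.splitlines():
        collector = _AnchorCollector()
        collector.feed(line)
        collector.close()
        chunks.append({"text": " ".join(collector.text),
                       "anchors": collector.anchors})
    return chunks


PARSERS = {"line": line_parser}  # assumed registry of available plug-ins


def load_parser(name):
    """Step 1: load a plug-in parser by its configured name."""
    return PARSERS[name]


def fetch(url):
    """Step 3: obtain page data (None when the page is missing)."""
    return PAGES.get(url)


def crawl(url, parser, index, visited):
    if url in visited:
        return
    visited.add(url)
    page_data = fetch(url)
    if page_data is None:
        return
    chunks = parser(page_data)  # Step 4: parse page data into window chunks
    index[url] = chunks
    for chunk in chunks:        # Step 5: recursively crawl the anchors
        for anchor in chunk["anchors"]:
            crawl(anchor, parser, index, visited)


def search_documents(seed_pages, parser_name):
    parser = load_parser(parser_name)
    index, visited = {}, set()
    for url in seed_pages:      # Step 2: the list of pages to start from
        crawl(url, parser, index, visited)
    return index


index = search_documents(["a.html"], "line")
```

Starting from the single seed `a.html`, the crawl reaches `b.html` and `c.html` through the anchors found in the chunks, and skips `d.html` because no page data is available for it. A production crawler would typically also bound the recursion depth and resolve relative URLs, both of which are omitted here.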
10. An apparatus for searching documents, comprising:
a crawling search engine, wherein the crawling search engine includes a plug-in interface; and
at least one plug-in parser for the crawling search engine, wherein the at least one plug-in parser parses page data for window chunks and wherein the crawling search engine loads one of the at least one plug-in parser via the plug-in interface.
Dependent claims: 11, 12, 13, 14
15. An apparatus for searching documents, comprising:
loading means for loading a plug-in parser;
list means for obtaining a list of pages;
page means for obtaining page data for the pages;
parsing means for using the plug-in parser to parse the page data for window chunks; and
crawling means for recursively crawling anchors in the window chunks.
Dependent claims: 16, 17, 18
19. A computer program product, in a computer readable medium, for configuring a search engine, comprising:
instructions for providing a plug-in interface for a crawling search engine; and
instructions for providing at least one plug-in parser for the crawling search engine, wherein the at least one plug-in parser parses page data for window chunks and wherein the crawling search engine loads one of the at least one plug-in parser via the plug-in interface.
Dependent claims: 20
21. A computer program product, in a computer readable medium, for searching documents, comprising:
instructions for loading a plug-in parser;
instructions for obtaining a list of pages;
instructions for obtaining page data for the pages;
instructions for using the plug-in parser to parse the page data for window chunks; and
instructions for recursively crawling anchors in the window chunks.
Specification