Plug-in parsers for configuring search engine crawler
First Claim
1. A method in a recordable-type media for configuring a search engine, the method comprising:
providing a plug-in interface for a crawling search engine; and
providing a plurality of different plug-in parsers for use with the crawling search engine, wherein the plug-in interface allows the crawling search engine to be configured with different plug-in parsers, wherein the plurality of different plug-in parsers parse page data for window chunks using different parsing algorithms.
Abstract
A plug-in interface is provided in a crawling search engine. Plug-in parsers are also provided for use with the search engine. The plug-in interface allows the search engine to be configured with different plug-in parsers. Thus, a customer may configure the search engine with the parser that best suits the customer's needs and may try new parsing algorithms to find the best results.
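The arrangement described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `PluginParser`, `FixedWidthParser`, `ParagraphParser`, and `SearchEngine` are hypothetical, and both chunking rules are assumed stand-ins for "different parsing algorithms", since this section does not define how a window chunk is computed.

```python
from abc import ABC, abstractmethod


class PluginParser(ABC):
    """Hypothetical plug-in interface: a parser turns page data into chunks."""

    @abstractmethod
    def parse(self, page_data: str) -> list[str]:
        ...


class FixedWidthParser(PluginParser):
    """Assumed algorithm 1: chunks of a fixed number of words."""

    def __init__(self, width: int = 4) -> None:
        self.width = width

    def parse(self, page_data: str) -> list[str]:
        words = page_data.split()
        return [
            " ".join(words[i:i + self.width])
            for i in range(0, len(words), self.width)
        ]


class ParagraphParser(PluginParser):
    """Assumed algorithm 2: one chunk per blank-line-separated paragraph."""

    def parse(self, page_data: str) -> list[str]:
        return [p.strip() for p in page_data.split("\n\n") if p.strip()]


class SearchEngine:
    """Toy engine whose parser can be swapped through the plug-in interface."""

    def __init__(self, parser: PluginParser) -> None:
        self.parser = parser

    def configure(self, parser: PluginParser) -> None:
        # Reconfigure the engine with a different plug-in parser.
        self.parser = parser

    def chunks_for(self, page_data: str) -> list[str]:
        return self.parser.parse(page_data)
```

Because the engine depends only on the interface, trying a new parsing algorithm amounts to calling `configure` with another `PluginParser` subclass; the engine code itself does not change.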
18 Claims
1. A method in a recordable-type media for configuring a search engine, the method comprising:
providing a plug-in interface for a crawling search engine; and
providing a plurality of different plug-in parsers for use with the crawling search engine,
wherein the plug-in interface allows the crawling search engine to be configured with different plug-in parsers,
wherein the plurality of different plug-in parsers parse page data for window chunks using different parsing algorithms.
Dependent claims: 2, 3, 4, 5

6. A method in a recordable-type media for searching documents, the method comprising:
loading a plug-in parser by a crawling search engine from a plurality of different plug-in parsers for use with the crawling search engine via a plug-in interface for the crawling search engine, wherein the plurality of different plug-in parsers parses page data for window chunks using different parsing algorithms;
obtaining a list of pages;
obtaining page data for the pages;
using the plug-in parser to parse the page data for a set of window chunks; and
recursively crawling anchors in the set of window chunks.
Dependent claims: 7, 8, 9

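The steps recited in claim 6 can be sketched as a short recursive crawl. This is an illustration under stated assumptions, not the patented implementation: the in-memory `PAGES` table stands in for real page retrieval, `line_parser` assumes one window chunk per non-empty line, and the `ANCHOR` pattern handles only the simple `<a href="...">` form used in the sample data.

```python
import re
from typing import Callable, Optional

# Hypothetical in-memory "web" (URL -> page data); a real crawler would fetch pages.
PAGES = {
    "a": 'Intro <a href="b">B</a>\nSee also <a href="c">C</a>',
    "b": 'Back to <a href="a">A</a>',
    "c": "No anchors here",
}

# Simplified anchor pattern, sufficient only for the sample pages above.
ANCHOR = re.compile(r'<a href="([^"]+)">')


def fetch(url: str) -> str:
    """Obtain page data for a page (empty string if the page is unknown)."""
    return PAGES.get(url, "")


def line_parser(page_data: str) -> list[str]:
    """Assumed plug-in parser: one window chunk per non-empty line."""
    return [line for line in page_data.splitlines() if line.strip()]


def crawl(
    pages: list[str],
    parser: Callable[[str], list[str]],
    seen: Optional[dict[str, list[str]]] = None,
) -> dict[str, list[str]]:
    """Parse each page into chunks, then recursively crawl anchors in the chunks."""
    seen = {} if seen is None else seen
    for url in pages:
        if url in seen:  # avoid revisiting pages and looping on cycles
            continue
        chunks = parser(fetch(url))
        seen[url] = chunks
        anchors = [t for chunk in chunks for t in ANCHOR.findall(chunk)]
        crawl(anchors, parser, seen)
    return seen
```

Calling `crawl(["a"], line_parser)` visits every page reachable from `a`. The `seen` mapping is an added safeguard against link cycles (here `a` and `b` link to each other); the claim itself does not mention one.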
10. An apparatus for searching documents, comprising:
a crawling search engine, wherein the crawling search engine includes a plug-in interface that allows the crawling search engine to be configured with different plug-in parsers;
a memory that contains a set of instructions;
a plurality of different plug-in parsers for use with the crawling search engine, wherein the plurality of different plug-in parsers parse page data for window chunks using different parsing algorithms; and
a processing unit, responsive to execution of the set of instructions, loading from the crawling search engine one of the plurality of different plug-in parsers via the plug-in interface and parsing with the loaded plug-in parser page data for window chunks.
Dependent claims: 11, 12, 13, 14

15. A recordable-type medium stored thereon computer usable program code for configuring a search engine, the computer usable program code, when executed by a computer, causes the computer to perform:
providing a plug-in interface for a crawling search engine; and
providing a plurality of different plug-in parsers for use with the crawling search engine,
wherein the plug-in interface allows the crawling search engine to be configured with different plug-in parsers,
wherein the plurality of different plug-in parsers parse page data for window chunks using different parsing algorithms.
Dependent claims: 16

17. A recordable-type medium stored thereon computer usable program code for searching documents, the computer usable program code, when executed by a computer, causes the computer to perform:
loading a plug-in parser by a crawling search engine from a plurality of different plug-in parsers for use with the crawling search engine via a plug-in interface for the crawling search engine, wherein the plurality of different plug-in parsers parse page data for window chunks using different parsing algorithms;
obtaining a list of pages;
obtaining page data for the pages;
using the plug-in parser to parse the page data for window chunks; and
recursively crawling anchors in the window chunks.

18. A method in a recordable-type media for configuring a search engine, the method comprising:
providing a plug-in interface for a crawling search engine;
providing a plurality of different plug-in parsers for use with the crawling search engine;
wherein the plug-in interface allows the crawling search engine to be configured with different plug-in parsers,
wherein the plurality of different plug-in parsers parse page data for window chunks using different parsing algorithms;
wherein the crawling search engine loads one of the plurality of different plug-in parsers via the plug-in interface;
wherein the loaded plug-in parser performs the step of parsing the page data before the crawling search engine performs other actions relative to the page data.

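The ordering in claim 18 (the loaded parser runs before the engine does anything else with the page data) can be shown with a small sketch. Everything here is assumed for illustration: `PARSER_REGISTRY` and `register` are a hypothetical stand-in for the plug-in interface, indexing is a made-up example of "other actions", and the `log` list exists only to make the order of operations visible.

```python
from typing import Callable

Parser = Callable[[str], list[str]]

# Hypothetical registry playing the role of the plug-in interface.
PARSER_REGISTRY: dict[str, Parser] = {}


def register(name: str) -> Callable[[Parser], Parser]:
    """Decorator that makes a parser available to the engine under a name."""
    def decorator(fn: Parser) -> Parser:
        PARSER_REGISTRY[name] = fn
        return fn
    return decorator


@register("lines")
def parse_lines(page_data: str) -> list[str]:
    # Assumed algorithm: one chunk per non-empty line.
    return [line for line in page_data.splitlines() if line.strip()]


@register("sentences")
def parse_sentences(page_data: str) -> list[str]:
    # Assumed algorithm: one chunk per period-terminated sentence.
    return [s.strip() for s in page_data.split(".") if s.strip()]


class Engine:
    def __init__(self, parser_name: str) -> None:
        # Load one of the registered parsers via the (hypothetical) interface.
        self.parser = PARSER_REGISTRY[parser_name]
        self.index: dict[tuple[str, int], str] = {}
        self.log: list[str] = []

    def process(self, url: str, page_data: str) -> list[str]:
        # Parsing happens first, before any other action on the page data.
        chunks = self.parser(page_data)
        self.log.append("parse")
        # Indexing stands in for the engine's "other actions".
        for position, chunk in enumerate(chunks):
            self.index[(url, position)] = chunk
        self.log.append("index")
        return chunks
```

Selecting the parser by name keeps the choice a configuration detail: `Engine("lines")` and `Engine("sentences")` run the same engine code with different parsing algorithms.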
Specification