BUILDING AND MAINTAINING INFORMATION EXTRACTION RULES
Abstract
Methods and arrangements for managing the development of information extraction rules. One or more documents are opened for extraction. An interface is provided to create a label and thereupon label a portion of the document. The created label is stored, and an extractor is developed based on the labeling. A test interface is provided for the extractor, and results of a test conducted through the test interface are displayed. The extractor is exported. In accordance with at least one embodiment, developers are given automated guidance for writing extractors, which reduces the overall manual effort involved in extractor development. Generally, a focused, tutorial-type environment serves as a guide based on previously developed best practices.
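The steps named in the abstract (open, label, store, develop, test, display, export) can be illustrated with a minimal sketch. Nothing below comes from the patent itself: the names Workspace, Label, develop_extractor, run_test and export_extractor are hypothetical, and the rule-induction step (turning each labeled example into a regular expression in which every digit is generalized to \d) is an assumption chosen only to keep the example small. The abstract does not specify a rule language or an export format.

```python
import json
import re
from dataclasses import dataclass, field


@dataclass
class Label:
    """A named span marked in a document (hypothetical structure)."""
    name: str
    doc_id: str
    start: int
    end: int


@dataclass
class Workspace:
    """Holds opened documents and stored labels (hypothetical structure)."""
    documents: dict = field(default_factory=dict)  # doc_id -> text
    labels: list = field(default_factory=list)     # stored Label objects

    def open_document(self, doc_id, text):
        self.documents[doc_id] = text

    def add_label(self, name, doc_id, start, end):
        # Create the label and store it in one step.
        label = Label(name, doc_id, start, end)
        self.labels.append(label)
        return label

    def labeled_text(self, label):
        return self.documents[label.doc_id][label.start:label.end]


def develop_extractor(workspace, label_name):
    """Build a regex rule from labeled spans (assumed approach):
    digits are generalized to \\d, every other character is kept literal."""
    patterns = set()
    for label in workspace.labels:
        if label.name == label_name:
            text = workspace.labeled_text(label)
            patterns.add(
                "".join(r"\d" if ch.isdigit() else re.escape(ch) for ch in text)
            )
    if not patterns:
        raise ValueError(f"no stored labels named {label_name!r}")
    return {"label": label_name, "pattern": "|".join(sorted(patterns))}


def run_test(extractor, text):
    """Apply the extractor to unseen text and return the matched strings."""
    return [m.group(0) for m in re.finditer(extractor["pattern"], text)]


def export_extractor(extractor):
    """Serialize the extractor; JSON is an assumed export format."""
    return json.dumps(extractor, indent=2)


if __name__ == "__main__":
    ws = Workspace()
    ws.open_document("d1", "Call 555-0100 before noon.")
    ws.add_label("phone", "d1", 5, 13)           # labels the span "555-0100"
    rule = develop_extractor(ws, "phone")
    print(run_test(rule, "Fax 555-0199, desk 555-0142."))  # display test results
    print(export_extractor(rule))
```

In this sketch the test step stands in for the test interface and printing stands in for displaying results; a real tool would present both through a user interface and would likely use a richer rule language than plain regular expressions.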
20 Claims
1. A method comprising:
opening one or more documents for extraction;
providing an interface to create a label and thereupon label a portion of the document;
storing the created label;
developing an extractor based on the labeling;
providing a test interface for the extractor;
displaying results of a test conducted through the test interface; and
exporting the extractor.
Dependent claims: 2-17.
18. An apparatus comprising:
at least one processor; and
a computer readable storage medium having computer readable program code embodied therewith and executable by the at least one processor, the computer readable program code comprising:
computer readable program code configured to open one or more documents for extraction;
computer readable program code configured to provide an interface to create a label and thereupon label a portion of the document;
computer readable program code configured to store the created label;
computer readable program code configured to develop an extractor based on the labeling;
computer readable program code configured to provide a test interface for the extractor;
computer readable program code configured to display results of a test conducted through the test interface; and
computer readable program code configured to export the extractor.
19. A computer program product comprising:
a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising:
computer readable program code configured to open one or more documents for extraction;
computer readable program code configured to provide an interface to create a label and thereupon label a portion of the document;
computer readable program code configured to store the created label;
computer readable program code configured to develop an extractor based on the labeling;
computer readable program code configured to provide a test interface for the extractor;
computer readable program code configured to display results of a test conducted through the test interface; and
computer readable program code configured to export the extractor.
Dependent claim: 20.
Specification