Form-based ontology creation and information harvesting
First Claim
1. In a computing environment, a method of extracting data from web pages and organizing the extracted data in a user searchable format, the method comprising:
- at a graphical user interface, receiving user input defining a tabular form;
- at the graphical user interface, receiving user input correlating one or more portions of the form with one or more user selected data items contained in one or more first web pages;
- a computer module correlating the user input to create an ontology defining relationships between the user selected data items based on the definition of the tabular form;
- a computer module accessing one or more other web pages, and based on a context of the one or more data items in the first web page being similar to a context of the selected data items in the one or more first web pages, extracting one or more similar data items from the one or more other web pages;
- a computer module correlating the extracted data items to each other in accordance with the ontology defining relationships between the user selected data items;
- a computer module outputting the correlated extracted data items as a user searchable data structure.
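As an illustration of the step that turns a tabular form into an ontology, the following minimal Python sketch derives relationship triples from a form layout. It is not the claimed implementation: the `FormField` class, the `form_to_ontology` function, the `has_one`/`has_many` relation names, and the car example are all hypothetical and chosen only to show how column structure could imply relationships.

```python
from dataclasses import dataclass, field


@dataclass
class FormField:
    """One column of a user-defined tabular form (hypothetical structure)."""
    label: str
    multi_valued: bool = False  # the column may hold several values per row
    children: list["FormField"] = field(default_factory=list)  # nested sub-columns


def form_to_ontology(title: str, fields: list[FormField]) -> list[tuple[str, str, str]]:
    """Derive (subject, relation, object) triples from the form layout."""
    triples = []
    for f in fields:
        relation = "has_many" if f.multi_valued else "has_one"
        triples.append((title, relation, f.label))
        # A nested column describes its parent column, not the form's main entity.
        triples.extend(form_to_ontology(f.label, f.children))
    return triples


# Assumed example form: a car listing with a repeating, nested "Feature" column.
car_form = [
    FormField("Make"),
    FormField("Price"),
    FormField("Feature", multi_valued=True, children=[FormField("Category")]),
]
ontology = form_to_ontology("Car", car_form)
for triple in ontology:
    print(triple)
```

In this sketch a single-valued column becomes a one-to-one relationship and a repeating column becomes a one-to-many relationship; a real system would likely record richer constraints.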
Abstract
Extracting data from web pages. User input is received defining a tabular form. User input is received correlating portions of the form with user selected data items contained in one or more first web pages. The user input is correlated to create an ontology defining relationships between the user selected data items based on the definition of the tabular form. One or more other web pages are accessed, and based on a context of the one or more data items in the first web page being similar to a context of the selected data items in the one or more first web pages, one or more similar data items are extracted from the one or more other web pages. The extracted data items are correlated to each other in accordance with the ontology defining relationships between the user selected data items and are output as a user searchable data structure.
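The context-based extraction described in the abstract can be sketched as follows. This is one simple interpretation, assuming that "context" means the literal text surrounding the selected item; the source does not specify how context similarity is measured. The function names, the `width` parameter, and the sample HTML are hypothetical.

```python
import re


def learn_context(page: str, selected: str, width: int = 12) -> tuple[str, str]:
    """Return the text immediately before and after the user-selected item."""
    start = page.index(selected)
    end = start + len(selected)
    return page[max(0, start - width):start], page[end:end + width]


def extract_similar(page: str, context: tuple[str, str]) -> list[str]:
    """Find items in another page that sit between the same left/right context."""
    left, right = context
    # Escape the context so markup characters are matched literally.
    pattern = re.escape(left) + r"(.+?)" + re.escape(right)
    return re.findall(pattern, page)


# Assumed sample pages: the user selected "$4,500" on the first page.
first_page = '<td class="price">$4,500</td>'
other_page = '<td class="price">$7,250</td><td class="price">$3,100</td>'

context = learn_context(first_page, "$4,500")
similar = extract_similar(other_page, context)
print(similar)
```

Exact string matching is brittle when markup varies between pages, so a practical extractor would probably compare structural features such as tag paths rather than raw characters.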
19 Claims
1. In a computing environment, a method of extracting data from web pages and organizing the extracted data in a user searchable format, the method comprising:
- at a graphical user interface, receiving user input defining a tabular form;
- at the graphical user interface, receiving user input correlating one or more portions of the form with one or more user selected data items contained in one or more first web pages;
- a computer module correlating the user input to create an ontology defining relationships between the user selected data items based on the definition of the tabular form;
- a computer module accessing one or more other web pages, and based on a context of the one or more data items in the first web page being similar to a context of the selected data items in the one or more first web pages, extracting one or more similar data items from the one or more other web pages;
- a computer module correlating the extracted data items to each other in accordance with the ontology defining relationships between the user selected data items;
- a computer module outputting the correlated extracted data items as a user searchable data structure.

Dependent claims: 2-18
19. In a computing environment, a system for extracting data from web pages and organizing the extracted data in a user searchable format, the system comprising:
- one or more processors;
- a graphical user interface, wherein the graphical user interface:
  - receives user input defining a tabular form; and
  - receives user input correlating one or more portions of the form with one or more user selected data items contained in one or more first web pages;
- a first computer module implemented using computer executable instructions executed by one or more processors, wherein the first computer module correlates the user input received at the graphical user interface to create an ontology defining relationships between the user selected data items based on the definition of the tabular form;
- a second computer module implemented using computer executable instructions executed by one or more processors, wherein the second computer module accesses one or more other web pages, and based on a context of the one or more data items in the first web page being similar to a context of the selected data items in the one or more first web pages, extracts one or more similar data items from the one or more other web pages;
- a third computer module implemented using computer executable instructions executed by one or more processors, wherein the second computer module correlates the extracted data items to each other in accordance with the ontology defining relationships between the user selected data items; and
- a fourth computer module implemented using computer executable instructions executed by one or more processors, wherein the fourth computer module outputs the correlated extracted data items as a user searchable data structure.
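The output stage, in which correlated items are emitted as a user searchable data structure, could take many forms. The sketch below assumes an in-memory SQLite table as that structure, with one column per form field. The table name `items`, the function `build_searchable_table`, and the sample records are hypothetical and not taken from the claims.

```python
import sqlite3


def build_searchable_table(columns: list[str], records: list[dict[str, str]]) -> sqlite3.Connection:
    """Load correlated records into an in-memory table, one column per form field."""
    conn = sqlite3.connect(":memory:")
    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f"CREATE TABLE items ({col_defs})")
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f"INSERT INTO items VALUES ({placeholders})",
        # A field missing from a record is stored as NULL.
        [tuple(r.get(c) for c in columns) for r in records],
    )
    return conn


# Assumed records, already grouped per listing according to the ontology.
columns = ["Make", "Price"]
records = [
    {"Make": "Honda", "Price": "7250"},
    {"Make": "Ford", "Price": "3100"},
]
conn = build_searchable_table(columns, records)

# Example search: listings priced under 5000.
cheap = conn.execute(
    'SELECT "Make" FROM items WHERE CAST("Price" AS INTEGER) < 5000'
).fetchall()
print(cheap)
```

A relational table is only one option here; the same records could equally be exposed through a full-text index or a graph store, depending on the kinds of queries users need.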
Specification