Extensible surface for consuming information extraction services
First Claim
1. In a computing environment, a method of representing structured data extracted from unstructured data in a fashion which allows querying using relational database concepts, the method comprising:
- receiving user input specifying one or more database views;
- receiving user input specifying an information extraction technique, the information extraction technique defining how to extract structured data from unstructured data and the information extraction technique comprising a phrase semantic extraction technique which determines a semantic relationship about one or more words based upon a semantic environment of the one or more words;
- receiving user input specifying a corpus of data comprising unstructured data, the unstructured data comprising data that is not organized semantically such that it does not have a formalized type and is not in a formal entity level relationship; and
- applying the extraction technique to the corpus of data to extract structured data from the unstructured data of the corpus of data and to produce the one or more database views including the extracted structured data.
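The claim's "phrase semantic extraction technique" infers a relationship about words from the words around them. The following is only a toy sketch of that idea, not the patented method: the cue-word table `CONTEXT_CUES`, the function `extract_relations`, and the rule that any capitalized word is an entity are all assumptions made for illustration. A real technique would rely on a trained model or fuller linguistic analysis.

```python
import re

# Hypothetical cue words that signal a relationship between two entities.
CONTEXT_CUES = {
    "works_for": {"works", "employed", "joined"},
    "located_in": {"based", "located", "headquartered"},
}


def extract_relations(text, window=4):
    """Return (subject, relation, object) triples found in unstructured text.

    Assumption: a capitalized word is an entity, and the relation between two
    neighbouring entities is guessed from the cue words lying between them.
    """
    tokens = re.findall(r"[A-Za-z]+", text)
    entity_positions = [i for i, tok in enumerate(tokens) if tok[0].isupper()]
    triples = []
    for a, b in zip(entity_positions, entity_positions[1:]):
        between = {tok.lower() for tok in tokens[a + 1 : b]}
        if len(between) > window:
            continue  # entities too far apart to share a semantic environment
        for relation, cues in CONTEXT_CUES.items():
            if between & cues:
                triples.append((tokens[a], relation, tokens[b]))
    return triples


triples = extract_relations(
    "Alice works for Contoso. Contoso is headquartered in Seattle."
)
```

Each triple is already row-shaped, which is what lets the output be placed in a database view with columns such as subject, relation, and object.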
2 Assignments
0 Petitions
Abstract
Representing structured data extracted from unstructured data in a fashion allowing querying using relational database concepts. A method includes receiving user input specifying one or more database views. The method further includes receiving user input specifying an information extraction technique, such as an extraction workflow. The method further includes receiving user input specifying a corpus of data. The extraction technique is applied to the corpus of data to produce the one or more database views. These views can then be queried or operated on using database tools.
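The flow in the abstract (a view, an extraction technique, and a corpus go in; a queryable view comes out) can be sketched with Python's standard `sqlite3` module. This is a minimal illustration under stated assumptions, not the implementation described in the patent: `build_view` and `extract_emails` are hypothetical names, and a simple regular expression stands in for the extraction technique. View and column names are interpolated into SQL directly, so the sketch assumes they come from a trusted caller.

```python
import re
import sqlite3


def extract_emails(document):
    """Hypothetical extraction technique: return (local_part, domain) rows."""
    return re.findall(r"([\w.]+)@([\w.]+\.\w+)", document)


def build_view(conn, view_name, columns, technique, corpus):
    """Apply `technique` to every document and expose the rows as a SQL view."""
    table = f"{view_name}_rows"
    cols = ", ".join(columns)
    conn.execute(f"CREATE TABLE {table} (doc_id INTEGER, {cols})")
    placeholders = ", ".join("?" for _ in range(len(columns) + 1))
    for doc_id, document in enumerate(corpus):
        for row in technique(document):
            conn.execute(
                f"INSERT INTO {table} VALUES ({placeholders})", (doc_id, *row)
            )
    conn.execute(f"CREATE VIEW {view_name} AS SELECT doc_id, {cols} FROM {table}")


# The three user inputs: a corpus, a view definition, and a technique.
corpus = [
    "Please contact ana@example.com or raj@example.org about the invoice.",
    "No address here.",
    "Forwarded by lee@example.com yesterday.",
]
conn = sqlite3.connect(":memory:")
build_view(conn, "contacts", ["local_part", "domain"], extract_emails, corpus)

# The resulting view can be queried with ordinary relational operations.
rows = conn.execute(
    "SELECT domain, COUNT(*) FROM contacts GROUP BY domain ORDER BY domain"
).fetchall()
```

Once the view exists, grouping, joining, and filtering work the same way they would on any other relation, which is the point of presenting extracted data through database views.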
36 Citations
19 Claims
1. In a computing environment, a method of representing structured data extracted from unstructured data in a fashion which allows querying using relational database concepts, the method comprising:
- receiving user input specifying one or more database views;
- receiving user input specifying an information extraction technique, the information extraction technique defining how to extract structured data from unstructured data and the information extraction technique comprising a phrase semantic extraction technique which determines a semantic relationship about one or more words based upon a semantic environment of the one or more words;
- receiving user input specifying a corpus of data comprising unstructured data, the unstructured data comprising data that is not organized semantically such that it does not have a formalized type and is not in a formal entity level relationship; and
- applying the extraction technique to the corpus of data to extract structured data from the unstructured data of the corpus of data and to produce the one or more database views including the extracted structured data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. One or more physical computer readable storage media comprising computer executable instructions that when executed by one or more processors cause the following to be performed:
- accept user input specifying one or more database views;
- accept user input specifying an information extraction technique, the information extraction technique defining how to extract structured data from unstructured data and the information extraction technique comprising a phrase semantic extraction technique which determines a semantic relationship about one or more words based upon a semantic environment of the one or more words;
- accept user input specifying a corpus of data comprising unstructured data, the unstructured data comprising data that is not organized semantically such that it does not have a formalized type and is not in a formal entity level relationship; and
- apply the extraction technique to the corpus of data to extract structured data from the unstructured data of the corpus of data and to produce the one or more database views including the extracted structured data.
Dependent claims: 16, 17, 18
19. A system for representing structured data extracted from unstructured data in a fashion which allows querying using relational database concepts, the system comprising one or more processors and one or more physical computer readable storage media having encoded thereon computer executable instructions which, when executed upon the one or more processors, cause the system to:
- store one or more corpuses of data, each of the one or more corpuses of data comprising unstructured data;
- receive user input specifying one or more database views;
- receive user input specifying an information extraction technique, the information extraction technique defining how to extract structured data from unstructured data and the information extraction technique comprising a phrase semantic extraction technique which determines a semantic relationship about one or more words based upon a semantic environment of the one or more words;
- receive user input specifying one or more of the stored one or more corpuses of data comprising unstructured data, the unstructured data comprising data that is not organized semantically such that it does not have a formalized type and is not in a formal entity level relationship; and
- apply the extraction technique to the specified one or more corpuses of data to extract structured data from the unstructured data of the corpuses of data and to produce the one or more database views including the extracted structured data.
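Claim 19 differs from claim 1 mainly in that the system stores several corpora and the user picks which of them to process. A rough sketch of that selection step follows. `CorpusStore` and `capitalized_words` are hypothetical names invented here, the store is a plain in-memory dictionary, and the capitalized-word rule is a deliberately crude stand-in for a real extraction technique.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Row = Tuple[str, ...]


class CorpusStore:
    """Hypothetical in-memory store of named corpora of unstructured text."""

    def __init__(self) -> None:
        self._corpora: Dict[str, List[str]] = {}

    def add(self, name: str, documents: Iterable[str]) -> None:
        self._corpora[name] = list(documents)

    def extract(
        self, names: Iterable[str], technique: Callable[[str], List[Row]]
    ) -> List[Tuple[str, int, Row]]:
        """Apply `technique` only to the corpora the user specified."""
        results = []
        for name in names:
            for doc_id, document in enumerate(self._corpora[name]):
                for row in technique(document):
                    results.append((name, doc_id, row))
        return results


def capitalized_words(document: str) -> List[Row]:
    """Crude stand-in technique: treat each capitalized word as a name."""
    return [(word,) for word in document.split() if word[:1].isupper()]


store = CorpusStore()
store.add("memos", ["meeting with Dana", "no names"])
store.add("emails", ["reply to Omar today"])
store.add("notes", ["call Priya"])

# Only the two specified corpora are processed; "notes" is left untouched.
selected = store.extract(["memos", "emails"], capitalized_words)
```

Tagging each row with its corpus name and document index keeps the provenance available as ordinary columns if the rows are later loaded into a view.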
Specification