Relational text index creation and searching
Abstract
In an environment where it is desired to perform information extraction over a large quantity of textual data, methods, tools and structures are provided for building a relational text index from the textual data and performing searches using the relational text index.
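As a rough illustration of the idea in the abstract, the sketch below stores thematic role extractions in a relational table and searches it. The schema, the column names, and the sample rows are all assumptions made for this example; the source does not specify any particular table layout or database engine.

```python
import sqlite3

# Hypothetical schema (not from the patent): one row per (event, thematic
# role) pair, with a reference back to the source document and sentence.
SCHEMA = """
CREATE TABLE extraction (
    event_id    INTEGER,
    predicate   TEXT,
    role        TEXT,
    filler      TEXT,
    doc_uri     TEXT,
    sentence_no INTEGER
)
"""

# Made-up sample extractions, for illustration only.
ROWS = [
    (1, "acquire", "agent", "acme", "doc1.txt", 3),
    (1, "acquire", "patient", "widgetco", "doc1.txt", 3),
    (2, "sue", "agent", "widgetco", "doc2.txt", 7),
    (2, "sue", "patient", "acme", "doc2.txt", 7),
]


def build_index(rows):
    """Create an in-memory relational text index from extraction rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO extraction VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def search(conn, role, filler):
    """Return (doc_uri, sentence_no, predicate) where `filler` plays `role`."""
    cur = conn.execute(
        "SELECT DISTINCT doc_uri, sentence_no, predicate FROM extraction "
        "WHERE role = ? AND filler = ? ORDER BY doc_uri, sentence_no",
        (role, filler),
    )
    return cur.fetchall()
```

A query such as `search(build_index(ROWS), "agent", "acme")` asks for every place where "acme" is the actor of an event, which a plain keyword index cannot distinguish from places where it is acted upon.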
11 Claims
1. A computer program product located to one or more storage media devices usable to perform thematic role data mining on a set of documents, said computer program product comprising computer readable instructions executable by a computer to perform the functions of:
reading a relational text index, said index including thematic role information and corresponding document location information, the thematic role information being a product of thematic role extraction, the document location information including references to source documents containing the natural language text sourced for the thematic role extraction;
performing data mining analytic processing on thematic role information read from the relational text index in said reading, said processing identifying common events or attributes in the thematic role information of the relational text index as a product; and
providing the common event or attribute product for further processing or display to a user, wherein said computer readable instructions are further executable to utilize sentence information to build a relational text index readable by said reading, the sentence information including thematic role information;
wherein said computer readable instructions are further executable to perform thematic role assignment on caseframe extractions to generate thematic role extraction information suitable for inclusion into a relational text index; and
wherein said computer readable instructions are further executable to perform unification of the thematic role extraction information.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
parse natural language sentences contained in a set of source documents; and
apply caseframes to the parsed natural language sentences to generate caseframe extractions.
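The two steps above (applying caseframes to parsed sentences, then assigning thematic roles to the resulting caseframe extractions as recited in claim 1) might look roughly like the following toy sketch. The `ParsedSentence` and `Caseframe` types, the slot names, and the slot-to-role table are hypothetical simplifications introduced here; the claims do not define a parser output format or a caseframe representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedSentence:
    # Hypothetical shallow-parse result; a real parser yields far more detail.
    verb: str
    voice: str  # "active" or "passive"
    subject: str = ""
    direct_object: str = ""
    by_phrase: str = ""


@dataclass(frozen=True)
class Caseframe:
    # Hypothetical caseframe: fires on a verb/voice pair and pulls out slots.
    verb: str
    voice: str
    slots: tuple


# Assumed mapping from (voice, syntactic slot) to thematic role.
ROLE_MAP = {
    ("active", "subject"): "agent",
    ("active", "direct_object"): "patient",
    ("passive", "subject"): "patient",
    ("passive", "by_phrase"): "agent",
}


def apply_caseframes(sentence, caseframes):
    """Return one caseframe extraction per caseframe matching the sentence."""
    extractions = []
    for cf in caseframes:
        if cf.verb == sentence.verb and cf.voice == sentence.voice:
            fillers = {
                slot: getattr(sentence, slot)
                for slot in cf.slots
                if getattr(sentence, slot)
            }
            extractions.append(
                {"predicate": cf.verb, "voice": cf.voice, "fillers": fillers}
            )
    return extractions


def assign_thematic_roles(extraction):
    """Map the syntactic slots of a caseframe extraction to thematic roles."""
    roles = {
        ROLE_MAP[(extraction["voice"], slot)]: filler
        for slot, filler in extraction["fillers"].items()
    }
    return {"predicate": extraction["predicate"], "roles": roles}
```

Under these assumptions, an active sentence ("Acme acquired WidgetCo") and its passive counterpart ("WidgetCo was acquired by Acme") produce the same role assignment, which is what makes the extractions comparable once they are stored in an index.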
-
-
7. The computer program product of claim 6, wherein said computer readable instructions are further executable to convert a set of sourced documents to a set of formatted documents, the formatted documents having a format suitable for said parsing.
-
8. The computer program product of claim 6, wherein said computer readable instructions are further executable to collect documents from a source and present the collected documents for said converting.
-
9. The computer program product of claim 1, wherein said computer readable instructions are executable to read a relational text index containing thematic role information, the location of documents, and locations within the documents corresponding to thematic role information.
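One possible reading of the analytic step in claim 1, combined with the in-document locations recited in claim 9, is a frequency count over role assignments that keeps the location references alongside each result. The row layout, the `min_docs` threshold, and the notion of "common" as "seen in at least N distinct documents" are assumptions made for this sketch, not definitions taken from the claims.

```python
from collections import defaultdict


def common_events(index_rows, min_docs=2):
    """Find (predicate, role, filler) triples recurring across documents.

    index_rows: iterable of (predicate, role, filler, doc_uri, char_offset)
    tuples, a hypothetical flattened view of the relational text index.
    Returns a dict mapping each common triple to its sorted locations.
    """
    locations = defaultdict(list)
    for predicate, role, filler, doc_uri, offset in index_rows:
        locations[(predicate, role, filler)].append((doc_uri, offset))

    result = {}
    for key, locs in locations.items():
        distinct_docs = {doc for doc, _ in locs}
        # "Common" here means the triple appears in at least min_docs documents.
        if len(distinct_docs) >= min_docs:
            result[key] = sorted(locs)
    return result
```

Because each result carries its `(doc_uri, char_offset)` pairs, a user shown a common event could be taken directly to the passages that support it.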
-
10. A computer program product located to one or more storage media devices usable to perform thematic role data mining on a set of documents, said computer program product comprising computer readable instructions executable by a computer to perform the functions of:
reading a relational text index, said index including thematic role information and corresponding document location information, the thematic role information being a product of thematic role extraction, the document location information including references to source documents containing the natural language text sourced for the thematic role extraction;
performing data mining analytic processing on thematic role information read from the relational text index in said reading, said processing identifying common events or attributes in the thematic role information of the relational text index as a product; and
providing the common event or attribute product for further processing or display to a user, wherein said computer readable instructions are further executable to utilize sentence information to build a relational text index readable by said reading, the sentence information including thematic role information;
wherein said computer readable instructions are further executable to perform thematic role assignment on caseframe extractions to generate thematic role extraction information suitable for inclusion into a relational text index; and
wherein said computer readable instructions are further executable to perform subject-specific conceptual role assignment.
-
-
11. A computer program product located to one or more storage media devices usable to perform thematic role data mining on a set of documents, said computer program product comprising computer readable instructions executable by a computer to perform the functions of:
reading a relational text index, said index including thematic role information and corresponding document location information, the thematic role information being a product of thematic role extraction, the document location information including references to source documents containing the natural language text sourced for the thematic role extraction;
performing data mining analytic processing on thematic role information read from the relational text index in said reading, said processing identifying common events or attributes in the thematic role information of the relational text index as a product; and
providing the common event or attribute product for further processing or display to a user, wherein said computer readable instructions are further executable to utilize sentence information to build a relational text index readable by said reading, the sentence information including thematic role information;
wherein said computer readable instructions are further executable to perform thematic role assignment on caseframe extractions to generate thematic role extraction information suitable for inclusion into a relational text index; and
wherein said computer readable instructions are further executable to:
parse natural language sentences contained in a set of source documents; and
apply caseframes to the parsed natural language sentences to generate caseframe extractions.
-
Specification