SYSTEM, METHOD, AND COMPUTER PROGRAM PRODUCT FOR DATA MINING AND AUTOMATICALLY GENERATING HYPOTHESES FROM DATA REPOSITORIES
Abstract
Various embodiments of the present invention provide systems, methods, and computer programs for generating a hypothesis. Specifically, some method embodiments include steps for accessing a system for extracting relationships and determining a relationship rule defining a relationship among a plurality of phrases and a plurality of concepts stored in the system for extracting relationships. Such embodiments further provide steps for parsing a plurality of documents in a data repository according to the relationship rule and generating a hypothesis comprising a previously unknown combination of phrases and concepts being at least partially determined from the parsed plurality of documents. Various embodiments also provide a step for presenting the hypothesis to a user so as to indicate the previously unknown combination.
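The steps summarized in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the patent: the vocabulary `PHRASE_TO_CONCEPT` stands in for the system for extracting relationships, the relationship rule is modeled as a regular expression that links two known phrases through a verb within one sentence, and a "previously unknown combination" is approximated as two concepts that share a linked concept but never appear together in a parsed relationship. The function names and the toy sentences are hypothetical.

```python
import re

# Hypothetical stand-in for the "system for extracting relationships":
# surface phrases mapped to the concepts they denote.
PHRASE_TO_CONCEPT = {
    "fish oil": "FISH_OIL",
    "omega-3": "FISH_OIL",
    "blood viscosity": "BLOOD_VISCOSITY",
    "raynaud's syndrome": "RAYNAUD",
    "raynaud's disease": "RAYNAUD",
}

# Assumed linking verbs used by the illustrative relationship rule.
LINK_VERBS = ("reduces", "affects", "is associated with")


def build_rule(vocabulary, verbs):
    """Compile a rule: known phrase, linking verb, known phrase, in one sentence."""
    # Longest phrases first so longer alternatives are tried before shorter ones.
    phrases = sorted(vocabulary, key=len, reverse=True)
    phrase_alt = "|".join(re.escape(p) for p in phrases)
    verb_alt = "|".join(re.escape(v) for v in verbs)
    pattern = rf"\b({phrase_alt})\b[^.]*?\b({verb_alt})\b[^.]*?\b({phrase_alt})\b"
    return re.compile(pattern, re.IGNORECASE)


def parse_documents(documents, rule, vocabulary):
    """Apply the rule to every document and collect (concept, concept) pairs."""
    relations = set()
    for text in documents:
        for match in rule.finditer(text):
            left = vocabulary[match.group(1).lower()]
            right = vocabulary[match.group(3).lower()]
            if left != right:  # ignore a concept related to itself
                relations.add((left, right))
    return relations


def generate_hypotheses(relations):
    """Propose concept pairs that share a neighbor but are not directly related."""
    known = {frozenset(pair) for pair in relations}
    neighbors = {}
    for left, right in relations:
        neighbors.setdefault(left, set()).add(right)
        neighbors.setdefault(right, set()).add(left)
    hypotheses = set()
    for bridge, linked in neighbors.items():
        for a in linked:
            for c in linked:
                # a < c visits each unordered pair once and skips a == c.
                if a < c and frozenset((a, c)) not in known:
                    hypotheses.add((a, c, bridge))
    return sorted(hypotheses)


def present(hypotheses):
    """Render each hypothesis so the new combination is explicit."""
    return [
        f"Hypothesis: {a} may be related to {c} (both linked to {bridge})"
        for a, c, bridge in hypotheses
    ]


if __name__ == "__main__":
    # Toy sentences written for this sketch, not taken from any repository.
    docs = [
        "Fish oil reduces blood viscosity.",
        "Blood viscosity is associated with Raynaud's syndrome.",
    ]
    rule = build_rule(PHRASE_TO_CONCEPT, LINK_VERBS)
    for line in present(generate_hypotheses(parse_documents(docs, rule, PHRASE_TO_CONCEPT))):
        print(line)
```

In this sketch the two toy sentences never mention fish oil and Raynaud's syndrome together, so that pair is reported as a candidate hypothesis. Adding a sentence that links the two directly makes the pair known, and it is no longer proposed. A practical system would presumably use richer rules than a single regular expression; the regex is only a compact way to show the rule-then-parse ordering.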
39 Citations
44 Claims
1. A method for generating a hypothesis, the method comprising:
accessing a system for extracting relationships, the system for extracting relationships comprising a plurality of phrases and a plurality of concepts;
determining a relationship rule defining a relationship among at least a portion of the plurality of phrases and at least a portion of the plurality of concepts;
parsing a plurality of documents in a data repository according to the relationship rule, the plurality of documents each comprising at least a portion of one of the plurality of phrases and the plurality of concepts;
generating a hypothesis comprising a previously unknown combination, the previously unknown combination including one of at least one of the plurality of phrases and at least one of the plurality of concepts, the previously unknown combination being at least partially determined from the parsed plurality of documents; and
presenting the hypothesis so as to indicate the previously unknown combination.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
18. A computer program product for generating a hypothesis based on a plurality of documents in a data repository in a manner that reduces the burden on the data repository, said computer program product comprising a computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising:
a first set of computer instructions for accessing a system for extracting relationships, the system for extracting relationships comprising a plurality of phrases and a plurality of concepts;
a second set of computer instructions for determining a relationship rule defining a relationship among at least a portion of the plurality of phrases and at least a portion of the plurality of concepts;
a third set of computer instructions for parsing the plurality of documents in the data repository according to the relationship rule, the plurality of documents each comprising at least a portion of one of the plurality of phrases and the plurality of concepts;
a fourth set of computer instructions for generating a hypothesis comprising a previously unknown combination, the previously unknown combination including one of at least one of the plurality of phrases and at least one of the plurality of concepts, the previously unknown combination being at least partially determined from the parsed plurality of documents; and
a fifth set of computer instructions for presenting the hypothesis so as to indicate the previously unknown combination.
Dependent claims: 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34
35. A system for mining information from a data repository comprising a plurality of documents to produce a hypothesis, the system comprising:
a system for extracting relationships comprising a plurality of phrases and a plurality of concepts;
a host computing element in communication with said system for extracting relationships for accessing said system for extracting relationships;
wherein said host computing element determines a relationship rule defining a relationship among at least a portion of the plurality of phrases and at least a portion of the plurality of concepts;
wherein said host computing element parses the plurality of documents in a data repository according to the relationship rule, the plurality of documents each comprising at least a portion of one of the plurality of phrases and the plurality of concepts; and
wherein said host computing element generates the hypothesis comprising a previously unknown combination, the previously unknown combination including one of at least one of the plurality of phrases and at least one of the plurality of concepts, the previously unknown combination being at least partially determined from the parsed plurality of documents; and
a user interface in communication with said host computing element, said user interface configured for presenting the hypothesis so as to indicate the previously unknown combination.
Dependent claims: 36, 37, 38, 39, 40, 41, 42, 43, 44
Specification