Semantic relationship extraction, text categorization and hypothesis generation
Abstract
Embodiments of the present invention are generally related to systems, methods, computer readable media, and other means for extracting entities, determining the semantic relationships among the entities and generating knowledge. More particularly, some embodiments of the present invention are directed to generating a hypothesis and/or gaining knowledge by automatically extracting semantic relationships from electronic documents stored at remote databases, which are accessed over the Internet. Some embodiments of the present invention also include textual categorization and new approaches and algorithms for solving the problems of relationship extraction, computing semantic relationships and generating hypotheses.
19 Claims
1. A hypothesis generation system, comprising:
   a communications module configured to communicate with a remote database, wherein the communications module facilitates the downloading of data from the remote database;
   a hypothesis generation module that is configured to:
      pre-compute co-occurring concepts included in the data prior to receiving a command to generate a hypothesis;
      generate a hypothesis based on the co-occurring concepts in response to receiving a hypothesis generation command; and
      transform the hypothesis into user-comprehendible results;
   a storage module configured to store the co-occurring concepts while awaiting the hypothesis generation command;
   a user input module configured to receive a user input and generate the hypothesis generation command; and
   a user interface module configured to present the user-comprehendible results on a display screen.

2. A method of generating hypotheses, comprising:
   communicating with a remote database;
   downloading data from the remote database;
   pre-computing co-occurring concepts that are included in the data, wherein the pre-computing occurs prior to receiving a user input that is associated with a command to generate one or more hypotheses;
   receiving the user input that is associated with the command to generate the one or more hypotheses;
   in response to a hypothesis generation engine receiving the command, generating at least one hypothesis based on the co-occurring concepts;
   transforming the at least one hypothesis into user-comprehendible results; and
   presenting the user-comprehendible results on a display screen.
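
The ordering in claim 2 (pre-compute co-occurrences first, generate hypotheses only when a command arrives) can be illustrated with a minimal Python sketch. The claim does not say how co-occurrence is counted or how a hypothesis is derived from it, so everything below is an assumption made for illustration: documents are modeled as lists of concept strings, co-occurrence means "appears in the same document", and a hypothesis is a concept reachable from the query only through an intermediate concept. The function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def precompute_cooccurrences(documents):
    """Count how many documents contain each unordered pair of concepts.

    Assumption: this runs ahead of time, before any user command arrives.
    """
    counts = Counter()
    for concepts in documents:
        for a, b in combinations(sorted(set(concepts)), 2):
            counts[(a, b)] += 1
    return counts


def generate_hypotheses(cooccurrences, query, min_count=1):
    """Return concepts linked to `query` only via an intermediate concept.

    Assumption: this indirect-link rule is one possible way to derive a
    hypothesis from co-occurrence data; the claim does not prescribe it.
    """
    neighbors = {}
    for (a, b), n in cooccurrences.items():
        if n >= min_count:
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)
    direct = neighbors.get(query, set())
    indirect = set()
    for mid in direct:
        indirect |= neighbors.get(mid, set())
    return sorted(indirect - direct - {query})


def to_readable(query, candidates):
    """Turn raw candidates into sentences a user can read."""
    return [f"'{query}' may be related to '{c}'" for c in candidates]


# Toy input, not taken from the patent.
documents = [
    ["fish oil", "blood viscosity"],
    ["blood viscosity", "raynaud disease"],
]
table = precompute_cooccurrences(documents)       # done before any command
results = to_readable("fish oil", generate_hypotheses(table, "fish oil"))
```

Keeping the co-occurrence table separate from the query step mirrors the claim's requirement that the expensive counting is finished and stored before the user asks for a hypothesis.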

3. A computer program product comprising a computer-readable storage medium having computer-readable program code portions stored therein and providing for hypothesis generation, the computer program product comprising:
   a first program code portion configured for communicating with a remote database;
   a second program code portion configured for downloading data from the remote database;
   a third program code portion configured for pre-computing co-occurring concepts that are included in the data, wherein the pre-computing occurs prior to receiving a user input that is associated with a command to generate one or more hypotheses;
   a fourth program code portion configured for receiving the user input that is associated with the command to generate the one or more hypotheses;
   a fifth program code portion configured for, in response to a hypothesis generation engine receiving the command, generating at least one hypothesis based on the co-occurring concepts;
   a sixth program code portion configured for transforming the at least one hypothesis into user-comprehendible results; and
   a seventh program code portion configured for presenting the user-comprehendible results on a display screen.

4. A method of generating a hypothesis comprising:
   extracting concept A, concept B, and concept C from a data set, wherein the extracting is performed by an entity extraction module specifically programmed to extract concepts from the data set;
   receiving the concept A, the concept B, and the concept C as inputs to a hypothesis generation module, wherein the hypothesis generation module is specifically programmed to generate a hypothesis;
   receiving as an input, at the hypothesis generation module, a semantic category associated with each of the concept A, the concept B, and the concept C; and
   outputting, from the hypothesis generation module, a hypothesis indicating an association among the concept A, the concept B, and the concept C, in response to the hypothesis generation module determining:
      that the concept A relates to the concept B;
      that the concept B relates to the concept C;
      that the concept A relates to the concept C; and
      that the semantic category associated with each of the concept A, the concept B and the concept C is the same type of semantic category.
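
The four conditions of claim 4 amount to a check over concept triples: all three pairs must be related and all three concepts must share one semantic category. The sketch below shows that check in Python under stated assumptions. The claim does not define how "relates to" is established, so relations are supplied here as a hypothetical set of unordered pairs, and categories as a hypothetical dictionary; neither representation comes from the patent.

```python
from itertools import combinations


def related(a, b, relations):
    """Treat relations as unordered: (a, b) and (b, a) are the same pair."""
    return (a, b) in relations or (b, a) in relations


def triangle_hypotheses(concepts, categories, relations):
    """Return (A, B, C) triples that satisfy all four conditions of the claim:
    A-B related, B-C related, A-C related, and one shared semantic category.
    """
    results = []
    for a, b, c in combinations(sorted(concepts), 3):
        same_category = categories[a] == categories[b] == categories[c]
        pairwise = (
            related(a, b, relations)
            and related(b, c, relations)
            and related(a, c, relations)
        )
        if same_category and pairwise:
            results.append((a, b, c))
    return results


# Toy input, not taken from the patent.
categories = {"gene1": "gene", "gene2": "gene", "gene3": "gene", "drugX": "drug"}
relations = {
    ("gene1", "gene2"),
    ("gene2", "gene3"),
    ("gene3", "gene1"),
    ("gene1", "drugX"),
    ("gene2", "drugX"),
}
triples = triangle_hypotheses(categories.keys(), categories, relations)
```

In this toy data the triple containing drugX is rejected even though its pairs are related, because its members do not share a semantic category.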

5. A method of generating a hypothesis comprising:
   extracting concept A and concept C from a data set, wherein the extracting is performed by an entity extraction module specifically programmed to extract concepts from the data set;
   receiving the concept A and the concept C as inputs to a hypothesis generation module, wherein the hypothesis generation module is specifically programmed to generate a hypothesis;
   determining a concept A semantic category associated with the concept A;
   determining a concept C semantic category associated with the concept C; and
   outputting, from the hypothesis generation module, a hypothesis that the concept A is a substitute for the concept C, in response to the hypothesis generation module determining:
      that the concept A semantic category is the same semantic category as the concept C semantic category;
      that the concept A and the concept C are co-occurring throughout the data set; and
      that the concept A and the concept C have at least a predetermined quantity of commonly associated ancillary concepts.
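
The substitute test in claim 5 combines three conditions, which can be sketched as a single predicate. The claim leaves several terms open, so the sketch makes explicit assumptions: "co-occurring throughout the data set" is read as "appearing together in at least `min_cooccur` documents", and an "ancillary concept" is read as any other concept that shares a document with A and also shares a document with C. The function name, the thresholds, and the toy data are hypothetical.

```python
def is_substitute(a, c, documents, categories, min_cooccur=1, min_shared=2):
    """Return True if `a` can be hypothesized as a substitute for `c`.

    Assumed reading of the three claim conditions:
      1. a and c have the same semantic category;
      2. a and c appear together in at least `min_cooccur` documents;
      3. a and c share at least `min_shared` ancillary concepts.
    """
    if categories[a] != categories[c]:
        return False

    docs = [set(d) for d in documents]
    together = sum(1 for d in docs if a in d and c in d)
    if together < min_cooccur:
        return False

    # Concepts seen alongside each of a and c, excluding a and c themselves.
    with_a = set().union(*(d for d in docs if a in d)) - {a, c}
    with_c = set().union(*(d for d in docs if c in d)) - {a, c}
    return len(with_a & with_c) >= min_shared


# Toy input, not taken from the patent.
categories = {"aspirin": "drug", "ibuprofen": "drug", "fever": "symptom"}
documents = [
    ["aspirin", "ibuprofen", "pain"],
    ["aspirin", "fever", "inflammation"],
    ["ibuprofen", "fever", "inflammation"],
]
answer = is_substitute("aspirin", "ibuprofen", documents, categories)
```

The `min_shared` parameter stands in for the claim's "predetermined quantity"; the claim does not give a value, so the default of 2 is arbitrary.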

6. A method of extracting semantic relationships among entities in a sentence, comprising:
   accessing at least one data set from a remote storage device specifically programmed to store data that represents documents containing at least one association between two or more entities;
   identifying a first sentence included in the at least one data set;
   extracting entities from the first sentence using an extraction engine specifically programmed to extract entities from a sentence;
   determining a semantic type associated with each of the entities;
   determining a relationship that exists between two or more of the entities;
   generating a subsequence that includes one or more semantic types and at least one verb, wherein the one or more semantic types include the semantic type associated with each of the two or more of the entities;
   calculating a weight value of the subsequence; and
   storing the subsequence, the weight value, the semantic type associated with each of the entities, and the relationship in a storage device that is specifically programmed to store the subsequence, the weight value, the semantic type associated with each of the entities, and the relationship.
   Dependent claims: 7, 8, 9, 10, 11, 12.
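
The subsequence and weight steps of claim 6 can be pictured with a small Python sketch. The claim does not specify how entities or verbs are recognized or how the weight is calculated, so the sketch assumes a hypothetical dictionary lookup for entity types, a hypothetical verb list, and a weight equal to the relative frequency of each subsequence across the input sentences. None of these choices is taken from the patent.

```python
from collections import Counter

# Hypothetical lexicons standing in for the extraction engine.
ENTITY_TYPES = {
    "aspirin": "DRUG",
    "ibuprofen": "DRUG",
    "headache": "SYMPTOM",
    "fever": "SYMPTOM",
}
VERBS = {"treats", "reduces", "causes"}


def subsequence(sentence):
    """Map a sentence to a tuple of semantic types and verbs, in order,
    dropping every token that is neither a known entity nor a known verb."""
    pattern = []
    for token in sentence.lower().rstrip(".").split():
        if token in ENTITY_TYPES:
            pattern.append(ENTITY_TYPES[token])
        elif token in VERBS:
            pattern.append(token)
    return tuple(pattern)


def is_valid(pattern):
    """Require at least one verb and at least two entity types."""
    verbs = sum(1 for item in pattern if item in VERBS)
    return verbs >= 1 and len(pattern) - verbs >= 2


def weigh(sentences):
    """Assumed weighting: share of valid subsequences matching each pattern."""
    patterns = [p for p in map(subsequence, sentences) if is_valid(p)]
    counts = Counter(patterns)
    total = sum(counts.values())
    return {p: n / total for p, n in counts.items()}


# Toy input, not taken from the patent.
sentences = [
    "Aspirin treats headache.",
    "Ibuprofen treats fever.",
    "Aspirin reduces fever.",
]
weights = weigh(sentences)
```

A dictionary such as `weights` is one possible shape for the stored record; the claim additionally stores the semantic types and the relationship, which this sketch folds into the pattern tuple itself.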

13. A system that extracts semantic relationships among entities in a sentence, comprising:
   a communications module configured to communicate with a remote storage device configured to store data that represents documents containing at least one association between two or more entities, wherein the communications module facilitates the downloading of at least one data set from the remote storage device;
   a local storage device configured to store the at least one data set;
   an entity extraction module configured to:
      identify a first sentence included in the at least one data set; and
      extract entities from the first sentence; and
   a semantic relationship extraction module configured to:
      determine a semantic type associated with each of the entities;
      determine a relationship that exists between two or more of the entities;
      generate a subsequence that includes one or more semantic types and at least one verb, wherein the one or more semantic types include the semantic type associated with each of the two or more of the entities; and
      calculate a weight value for the subsequence; and
   the local storage device is configured to store the subsequence, the weight value, the semantic type associated with each of the entities, and the relationship.
   Dependent claims: 14, 15, 16, 17, 18, 19.
Specification