Method and system for personal information extraction and modeling with fully generalized extraction contexts
Abstract
Systems and methods for modeling information from a set of documents are disclosed. A tool allows a user to model concepts of interest and extract information from a set of documents in an editable format. The extracted information includes a list of instances of a document from the set of documents that contains the selected concept. The user may modify the extracted information to create subsets of information, add new concepts to the model, and share the model with others.
17 Claims
1. A method for extracting information from a set of documents, the method comprising:
receiving a set of documents;
receiving a first concept, the first concept representing first information for extraction from the set of documents, wherein the first concept comprises a plurality of trigger phrases related to the first concept;
updating the first concept into a first subset and a second subset, the first subset comprising one or more of the plurality of trigger phrases of the first concept, the second subset comprising remaining trigger phrases of the plurality of trigger phrases of the first concept;
creating a second concept based on the first concept, the second concept having a definitional dependency from the first concept, the definitional dependency providing a contextual relationship to the second concept such that the second concept represents second information within a context of the first information;
updating the second concept into a third concept and a fourth concept, the third concept depending on the first subset of the first concept, the fourth concept depending on the second subset of the first concept;
selecting the second concept; and
based on the selection of the second concept, extracting from the set of documents, the second information within the context of the first information.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
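The steps recited in claim 1 can be pictured with a short sketch. It is an illustration under stated assumptions only, not the patented implementation: the `Concept` class, the helpers `split_concept`, `split_dependent`, and `extract`, and the sample documents are all hypothetical. Reading "within a context" as co-occurrence in the same sentence is likewise an assumption, since the claims do not define how context is detected.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Concept:
    """Hypothetical model of a concept: a name plus its trigger phrases."""
    name: str
    triggers: tuple[str, ...]
    # Definitional dependency: the concept whose context this one requires.
    context: Optional["Concept"] = None


def split_concept(concept, first_triggers):
    """Partition a concept's trigger phrases into two subset concepts."""
    first = tuple(t for t in concept.triggers if t in first_triggers)
    rest = tuple(t for t in concept.triggers if t not in first_triggers)
    return (Concept(f"{concept.name}/first", first, concept.context),
            Concept(f"{concept.name}/second", rest, concept.context))


def split_dependent(concept, subsets):
    """Derive one dependent concept per subset of the parent concept."""
    return [Concept(f"{concept.name} in {s.name}", concept.triggers, s)
            for s in subsets]


def mentions(concept, sentence):
    """True if any trigger phrase occurs as a whole phrase in the sentence."""
    return any(re.search(r"\b" + re.escape(t) + r"\b", sentence, re.IGNORECASE)
               for t in concept.triggers)


def extract(concept, documents):
    """Return (document index, sentence) pairs in which the concept and
    every concept it depends on are mentioned.
    Assumption: 'context' means co-occurrence in the same sentence."""
    hits = []
    for index, document in enumerate(documents):
        for sentence in re.split(r"(?<=[.!?])\s+", document):
            node = concept
            while node is not None and mentions(node, sentence):
                node = node.context
            if node is None:  # the whole dependency chain matched
                hits.append((index, sentence))
    return hits


# Illustrative data only; none of it comes from the patent.
documents = [
    "The invoice was sent Monday. The amount due is listed on the invoice.",
    "A wire transfer covered the full amount. The meeting ran late.",
    "The total amount was never discussed.",
]

# First concept with a plurality of trigger phrases.
payment = Concept("payment", ("invoice", "wire transfer", "check"))
# First and second subsets of the first concept's trigger phrases.
paper, electronic = split_concept(payment, {"invoice", "check"})
# Second concept, defined within the context of the first.
amount = Concept("amount", ("amount", "total"), context=payment)
# Third and fourth concepts, each depending on one subset.
amount_paper, amount_electronic = split_dependent(amount, (paper, electronic))

for selected in (amount, amount_paper, amount_electronic):
    print(selected.name, extract(selected, documents))
```

With this sample data, selecting `amount` returns one sentence from each of the first two documents. The third document mentions an amount but no payment trigger, so it yields nothing. Selecting `amount_paper` or `amount_electronic` narrows the result to the sentence whose payment trigger falls in the corresponding subset.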
9. A system for extracting information from a set of documents, comprising:
a memory; and
a processor configured to:
receive a set of documents;
receive a first concept, the first concept representing first information for extraction from the set of documents, wherein the first concept comprises a plurality of trigger phrases related to the first concept;
update the first concept into a first subset and a second subset, the first subset comprising one or more of the plurality of trigger phrases of the first concept, the second subset comprising remaining trigger phrases of the plurality of trigger phrases of the first concept;
create a second concept based on the first concept, the second concept having a definitional dependency from the first concept, the definitional dependency providing a contextual relationship to the second concept such that the second concept represents second information within a context of the first information;
update the second concept into a third concept and a fourth concept, the third concept depending on the first subset of the first concept, the fourth concept depending on the second subset of the first concept;
select the second concept; and
based on the selection of the second concept, extract from the set of documents, the second information within the context of the first information.
Dependent claims: 10, 11, 12, 13, 14, 15.
16. A non-transitory computer-readable medium encoded with instructions which, when executed by a computer, perform a method for extracting information from a set of documents, the method comprising:
receiving a set of documents;
receiving a first concept, the first concept representing first information for extraction from the set of documents, wherein the first concept comprises a plurality of trigger phrases related to the first concept;
updating the first concept into a first subset and a second subset, the first subset comprising one or more of the plurality of trigger phrases of the first concept, the second subset comprising remaining trigger phrases of the plurality of trigger phrases of the first concept;
creating a second concept based on the first concept, the second concept having a definitional dependency from the first concept, the definitional dependency providing a contextual relationship to the second concept such that the second concept represents second information within a context of the first information;
updating the second concept into a third concept and a fourth concept, the third concept depending on the first subset of the first concept, the fourth concept depending on the second subset of the first concept;
selecting the second concept; and
based on the selection of the second concept, extracting from the set of documents, the second information within the context of the first information.
Dependent claims: 17.
Specification