Method and system for personal information extraction and modeling with fully generalized extraction contexts
1 Assignment
0 Petitions
Abstract
Systems and methods for modeling information from a set of documents are disclosed. A tool allows a user to model concepts of interest and extract information from a set of documents in an editable format. The extracted information includes a list of instances of a document from the set of documents that contains the selected concept. The user may modify the extracted information to create subsets of information, add new concepts to the model, and share the model with others.
75 Citations
24 Claims
1. A method for modeling information implemented using a computer having a processor and a display from a set of documents, comprising:
importing a set of concepts;
creating a model including the concepts;
based on a user selection of a first concept, extracting first information from the set of documents using an extractor corresponding to the first concept, the first information comprising a list of documents containing the first concept;
based on the first information, defining a context of the first concept comprising a contiguous block of text in proximity of the first concept in each document in the list of documents containing the first concept;
based on a user selection of a second concept, extracting second information from the set of documents, the second information comprising a list of documents containing the second concept in the defined context of the first concept;
adding a new additional concept to the model wherein the new additional concept represents the second concept in the defined context of the first concept; and
extracting information related to the new additional concept from the set of documents using an extractor corresponding to the new additional concept.
Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
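
The steps of claim 1 can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the patent: an extractor is modeled as a function returning character spans, the "contiguous block of text in proximity" is a fixed window of 30 characters on either side of the first concept, and the sample documents and concept names are invented.

```python
import re
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int]
Extractor = Callable[[str], List[Span]]  # hypothetical extractor type


def regex_extractor(pattern: str) -> Extractor:
    """Build an extractor that returns the spans where a regex matches."""
    rx = re.compile(pattern, re.IGNORECASE)
    return lambda text: [m.span() for m in rx.finditer(text)]


def documents_containing(docs: Dict[str, str], extractor: Extractor) -> List[str]:
    """Return the ids of documents in which the extractor finds at least one hit."""
    return [doc_id for doc_id, text in docs.items() if extractor(text)]


def in_context_extractor(first: Extractor, second: Extractor, window: int = 30) -> Extractor:
    """Extractor for 'second concept in the context of first concept'.

    The context is assumed to be `window` characters on each side of every
    occurrence of the first concept.
    """
    def extractor(text: str) -> List[Span]:
        hits: List[Span] = []
        for start, end in first(text):
            lo, hi = max(0, start - window), min(len(text), end + window)
            # Run the second extractor on the context block only, then map
            # its spans back to positions in the full document.
            hits += [(lo + s, lo + e) for s, e in second(text[lo:hi])]
        return hits
    return extractor


# The model maps concept names to extractors (illustrative concepts only).
model: Dict[str, Extractor] = {
    "diabetes": regex_extractor(r"\bdiabetes\b"),
    "insulin": regex_extractor(r"\binsulin\b"),
}
docs = {
    "doc1": "Patient has diabetes and was started on insulin last week.",
    "doc2": "Insulin storage guidelines are listed here. " + "x" * 80
            + " No mention otherwise of diabetes.",
    "doc3": "Routine visit, no findings.",
}

# First information: documents containing the first concept.
first_info = documents_containing(docs, model["diabetes"])

# New additional concept with its own extractor, added to the model.
model["insulin_in_diabetes_context"] = in_context_extractor(
    model["diabetes"], model["insulin"]
)
second_info = documents_containing(docs, model["insulin_in_diabetes_context"])
```

In this sketch the composite extractor is applied to every document; documents without the first concept yield no context blocks, so the result is the same as restricting the search to the first list.
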
10. A method for personalizing a model of information implemented using a computer having a processor and a display, comprising:
obtaining a set of documents;
receiving the model that contains concepts;
extracting first information from the set of documents using an extractor corresponding to a first concept, the first information comprising a list of documents containing the first concept;
based on the first information, defining a context of the first concept comprising a contiguous block of text in proximity of the first concept in each document in the list of documents containing the first concept;
selecting a second concept to update;
updating the second concept;
extracting second information from the set of documents, the second information comprising a list of documents containing the updated second concept in the defined context of the first concept;
adding a new additional concept to the model wherein the new additional concept represents the updated second concept in the defined context of the first concept; and
extracting information related to the new additional concept from the set of documents using an extractor corresponding to the new additional concept.
Dependent Claims (11, 12, 13, 14, 15, 16)
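
Claim 10 differs from claim 1 mainly in that a received model is personalized by updating a concept before the in-context extraction. The following sketch shows one way this could look, under stated assumptions: a concept is a set of literal terms, "updating" means adding a term, and the context block is the sentence containing the first concept. None of these choices comes from the patent, and the concept names and documents are invented.

```python
import re


def matches(terms, text):
    """True if any term of the concept occurs in the text as a whole word."""
    return any(
        re.search(r"\b%s\b" % re.escape(term), text, re.IGNORECASE)
        for term in terms
    )


def context_blocks(terms, text):
    """Sentences that contain the concept (assumed unit of context)."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return [s for s in sentences if matches(terms, s)]


def docs_with_in_context(docs, first_terms, second_terms):
    """Documents where the second concept appears in a context block of the first."""
    return [
        doc_id
        for doc_id, text in docs.items()
        if any(matches(second_terms, block) for block in context_blocks(first_terms, text))
    ]


# Model as received from another user (hypothetical content).
received_model = {"diagnosis": {"hypertension"}, "medication": {"lisinopril"}}
docs = {
    "a": "Hypertension is controlled with lisinopril. Follow up in six months.",
    "b": "Hypertension noted; patient takes amlodipine. Lisinopril was stopped last year.",
}

before = docs_with_in_context(
    docs, received_model["diagnosis"], received_model["medication"]
)

# Personalize a copy of the model: update the second concept.
personal_model = {name: set(terms) for name, terms in received_model.items()}
personal_model["medication"].add("amlodipine")

after = docs_with_in_context(
    docs, personal_model["diagnosis"], personal_model["medication"]
)

# Record the new additional concept as a (first, second) pair of concept names.
context_concepts = {"medication_in_diagnosis_context": ("diagnosis", "medication")}
```

Copying the received model before editing keeps the shared version unchanged, which is one possible reading of "personalizing"; the claim itself does not require a copy.
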
17. A method for annotating a set of documents in a model of information implemented using a computer having a processor and a display, comprising:
downloading a set of documents;
receiving the model, wherein the model contains concepts;
extracting first information from the set of documents using an extractor corresponding to a first concept, the first information comprising a list of documents containing the first concept;
calculating a status of a document in the set of documents;
after extracting the first information, automatically annotating a document in the set of documents based on the calculation;
based on the first information, defining a context of the first concept comprising a contiguous block of text in proximity of the first concept in each document in the list of documents containing the first concept;
extracting second information from the set of documents, the second information comprising a list of documents containing a second concept in the defined context of the first concept;
adding a new additional concept to the model wherein the new additional concept represents the second concept in the defined context of the first concept; and
extracting information related to the new additional concept from the set of documents using an extractor corresponding to the new additional concept.
Dependent Claims (18, 19, 20)
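
The status calculation and automatic annotation steps of claim 17 could be sketched as follows. The claim text does not say what a "status" is, so the rule below (a label derived from the number of hits) is purely an assumption, as are the function names and the sample notes.

```python
import re


def make_extractor(pattern):
    """Hypothetical extractor: returns the matched strings for a regex."""
    rx = re.compile(pattern, re.IGNORECASE)
    return lambda text: [m.group(0) for m in rx.finditer(text)]


def calculate_status(hits):
    """Assumed status rule: label a document by how many hits it has."""
    if not hits:
        return "no-match"
    return "single-match" if len(hits) == 1 else "multiple-matches"


def annotate(docs, extractor):
    """Extract first, then attach a status annotation to every document."""
    annotations = {}
    for doc_id, text in docs.items():
        hits = extractor(text)
        annotations[doc_id] = {"status": calculate_status(hits), "hits": hits}
    return annotations


docs = {
    "n1": "Asthma since childhood; asthma worsens in winter.",
    "n2": "Mild asthma.",
    "n3": "No respiratory complaints.",
}
annotations = annotate(docs, make_extractor(r"\basthma\b"))
```
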
21. A method for modeling information implemented using a computer having a processor and a display from a set of documents, comprising:
importing a set of concepts and corresponding extractors;
creating a model including a first concept and a second concept;
establishing a definitional dependency between the first concept and the second concept;
based on a user selection of the first concept, extracting first information from the set of documents using an extractor corresponding to the first concept, the first information comprising a list of documents containing the first concept;
based on the first information, defining a context of the first concept comprising a contiguous block of text in proximity of the first concept in each document in the list of documents containing the first concept;
after extracting information related to the first concept, receiving a modification of the second concept;
after receiving the modification of the second concept, updating the first information using the definitional dependency between the first concept and the second concept;
adding a new additional concept to the model wherein the new additional concept represents the updated first information using the definitional dependency between the first concept and the second concept; and
based on a user selection of the new additional concept, extracting information related to the new additional concept from the set of documents using an extractor corresponding to the new additional concept.
Dependent Claims (22, 23)
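
The definitional dependency in claim 21 can be pictured as one concept whose definition includes another, so that modifying the second concept changes what the first one extracts. The sketch below is an illustration under assumptions only: concepts are sets of literal terms, a dependency means "include the other concept's terms", and the class, concept names, and documents are invented. The context-definition step of the claim is left out to keep the example short.

```python
import re


class Model:
    """Toy model: literal terms per concept plus definitional dependencies."""

    def __init__(self):
        self.terms = {}       # concept -> set of literal terms
        self.depends_on = {}  # concept -> set of concepts its definition includes

    def resolved_terms(self, concept, _seen=None):
        """Terms of a concept plus the terms of every concept it depends on."""
        seen = set() if _seen is None else _seen
        if concept in seen:  # guard against cyclic definitions
            return set()
        seen.add(concept)
        out = set(self.terms.get(concept, ()))
        for dep in self.depends_on.get(concept, ()):
            out |= self.resolved_terms(dep, seen)
        return out

    def extract(self, concept, docs):
        """Ids of documents containing any resolved term of the concept."""
        terms = self.resolved_terms(concept)
        if not terms:
            return []
        alternatives = "|".join(re.escape(t) for t in sorted(terms))
        rx = re.compile(r"\b(?:%s)\b" % alternatives, re.IGNORECASE)
        return [doc_id for doc_id, text in docs.items() if rx.search(text)]


model = Model()
model.terms["cardiac_drug"] = {"digoxin"}
model.terms["beta_blocker"] = {"metoprolol"}
model.depends_on["cardiac_drug"] = {"beta_blocker"}  # definitional dependency

docs = {
    "d1": "Started metoprolol 25 mg.",
    "d2": "Continue atenolol daily.",
    "d3": "Digoxin level normal.",
}

first_info = model.extract("cardiac_drug", docs)

# Modify the second concept; the first concept's results change through the dependency.
model.terms["beta_blocker"].add("atenolol")
updated_first_info = model.extract("cardiac_drug", docs)

# New additional concept: a snapshot of the updated definition under its own name.
model.terms["cardiac_drug_updated"] = model.resolved_terms("cardiac_drug")
new_concept_info = model.extract("cardiac_drug_updated", docs)
```
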
24. A system for modeling information from a set of documents, comprising:
a processor;
an extraction engine configured in the processor to extract information from the set of documents;
software components comprising;
  an importing component configured to import a set of concepts,
  a modeling component configured to create a model including the concepts,
  a first extraction component configured to cause the extraction engine to extract first information from the set of documents using an extractor corresponding to the first concept, the first information comprising a list of documents containing a first concept,
  a context component for, based on the first information, defining a context of the first concept comprising a contiguous block of text in proximity of the first concept in each document in the list of documents containing the first concept,
  a second extraction component configured to cause the extraction engine to extract second information from the set of documents, the second information comprising a list of documents containing a second concept in the defined context of the first concept,
  a context concept component configured to add a new additional concept to the model wherein the new additional concept represents the second concept in the defined context of the first concept, and
  a context extraction component configured to cause the extraction engine to extract information related to the new additional concept from the set of documents using an extractor corresponding to the new additional concept; and
a database configured to store the model.
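
The last element of claim 24, a database configured to store the model, might be realized in many ways. As a rough sketch only, the example below serializes a model to JSON and keeps it in an in-memory SQLite table; the table layout, function names, and model structure are assumptions for illustration and are not described in the claim.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open a SQLite database and make sure the (hypothetical) models table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS models (name TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def save_model(conn, name, model):
    """Store the model as a JSON document, replacing any earlier version."""
    conn.execute(
        "INSERT OR REPLACE INTO models (name, body) VALUES (?, ?)",
        (name, json.dumps(model, sort_keys=True)),
    )
    conn.commit()


def load_model(conn, name):
    """Return the stored model, or None if no model has that name."""
    row = conn.execute("SELECT body FROM models WHERE name = ?", (name,)).fetchone()
    return json.loads(row[0]) if row else None


conn = open_store()
model = {
    "concepts": {"diabetes": ["diabetes"], "insulin": ["insulin"]},
    "context_concepts": {
        "insulin_in_diabetes_context": {"first": "diabetes", "second": "insulin", "window": 30}
    },
}
save_model(conn, "demo", model)
restored = load_model(conn, "demo")
```

Storing the model as a single JSON document keeps the sketch short; a relational layout with one row per concept would be an equally plausible design.
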
Specification