Document information mining tool
Abstract
Embodiments of the present invention provide methods, computer-readable media, and systems for extracting referential keys from a document. A document is parsed to identify at least one key, the key being identified from at least one contextual indication. The key is classified according to a key type, the key type being identified from the contextual indication. The key is extracted and then stored in a location in a structured shell with the location corresponding to the key type. As a result, the key can be found by a search seeking one of the key and the key-type allowing a searcher to identify the document from which the key was extracted.
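As an illustration only, the parse, classify, extract, and store steps summarized in the abstract might be sketched as follows. This is not the patented implementation: the regular-expression indicators, the key-type names (`part_number`, `figure`), and the dictionary used as the "structured shell" are all assumptions chosen for the example.

```python
import re
from collections import defaultdict

# Hypothetical indicators: each textual cue both identifies a key and
# determines its key type. These patterns are assumptions for illustration.
INDICATORS = {
    "part_number": re.compile(r"Part\s+No\.\s*([A-Z0-9-]+)"),
    "figure": re.compile(r"Figure\s+(\d+)"),
}


def extract_keys(doc_id, text, shell=None):
    """Parse text, classify each key by the indicator that matched, and
    store it in the shell at the location for its key type."""
    shell = defaultdict(list) if shell is None else shell
    for key_type, pattern in INDICATORS.items():
        for match in pattern.finditer(text):
            key = match.group(1)  # the extracted key
            shell[key_type].append({"key": key, "document": doc_id})
    return shell


def find_documents(shell, key=None, key_type=None):
    """Return the documents whose stored keys match the key, the key type,
    or both."""
    hits = set()
    for stored_type, entries in shell.items():
        if key_type is not None and stored_type != key_type:
            continue
        for entry in entries:
            if key is None or entry["key"] == key:
                hits.add(entry["document"])
    return hits
```

Because each stored entry records the source document, a search by key or by key type leads back to the document from which the key was extracted.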
69 Claims
1. A method for extracting referential keys from a document, the method comprising:
parsing the document to identify at least one key, the key being identified from at least one indicator;
classifying the key according to a key type, the key type being identified from the at least one indicator;
extracting the key; and
storing the key in a location in a structured shell, the location corresponding to the key type such that the key can be found by a search seeking at least one of the key and the key-type allowing a searcher to identify the document from which the key was extracted.
Dependent claims: 2-12

13. A method for extracting referential keys from a document, the method comprising:
parsing a .pdf document to identify at least one key, the key being identified from at least one indicator, the indicator including at least one of a placement indicator, a format indicator, and a font indicator;
classifying the key according to a key type, the key type being identified from the at least one indicator;
extracting the key; and
storing the key in a location in an extensible markup language (XML) document, the location corresponding to the key type such that the key can be found by a search seeking at least one of the key and the key-type allowing a searcher to identify the document from which the key was extracted.
Dependent claims: 14-21

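To make the role of placement, format, and font indicators in claim 13 more concrete, the sketch below classifies text spans and files them by key type in an XML tree. It is a hypothetical illustration, not the claimed method: the `Span` record stands in for the output of some PDF text extractor (no PDF is actually parsed here), and the thresholds, font names, key types, and XML element names are all assumptions.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Span:
    """A run of text with layout attributes, as a PDF extractor might report."""
    text: str
    page: int
    y: float      # placement: distance from the top of the page, in points
    font: str     # font indicator
    size: float   # format indicator
    bold: bool    # format indicator


def classify(span):
    """Return a key type derived from the span's indicators, or None."""
    if span.y < 72 and span.size >= 14:
        return "title"            # large text near the top of the page
    if span.bold and span.text[:1].isdigit():
        return "section-heading"  # bold text that starts with a number
    if span.font.startswith("Courier"):
        return "part-number"      # monospaced text (assumed convention)
    return None


def build_shell(doc_name, spans):
    """Store each classified key under the XML element for its key type."""
    root = ET.Element("document", {"source": doc_name})
    for span in spans:
        key_type = classify(span)
        if key_type is None:
            continue
        group = root.find(f"keys[@type='{key_type}']")
        if group is None:
            group = ET.SubElement(root, "keys", {"type": key_type})
        key = ET.SubElement(group, "key", {"page": str(span.page)})
        key.text = span.text
    return root
```

A search by key type then reduces to a path query such as `root.findall("keys[@type='part-number']/key")`, and the `source` attribute on the root identifies the originating document.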
22. A computer-readable medium having stored thereon instructions for extracting referential keys from a document, the computer-readable medium comprising:
a first computer program portion adapted to parse the document to identify at least one key, the key being identified from at least one indicator;
a second computer program portion adapted to classify the key according to a key type, the key type being identified from the at least one indicator;
a third computer program portion adapted to extract the key; and
a fourth computer program portion adapted to store the key in a location in a structured shell, the location corresponding to the key type such that the key can be found by a search seeking one of the key and the key-type allowing a searcher to identify the document from which the key was extracted.
Dependent claims: 23-33

34. A computer-readable medium for extracting referential keys from a document, the computer-readable medium comprising:
a first computer program portion adapted to parse a .pdf document to identify at least one key, the key being identified from at least one indicator, the indicator including at least one of a placement indicator, a format indicator, and a font indicator;
a second computer program portion adapted to classify the key according to a key type, the key type being identified from the at least one indicator;
a third computer program portion adapted to extract the key; and
a fourth computer program portion adapted to store the key in a location in an extensible markup language (XML) document, the location corresponding to the key type such that the key can be found by a search seeking one of the key and the key-type allowing a searcher to identify the document from which the key was extracted.
Dependent claims: 35-42

43. A system for extracting referential keys from a document, the system comprising:
a parser configured to parse the document to identify at least one key, the key being identified from at least one indicator;
an identifier configured to identify the key according to a key type, the key type being identified from the at least one indicator; and
a classifier configured to extract the key and store the key in a location in a structured shell, the location corresponding to the key type such that the key can be found by a search seeking one of the key and the key-type allowing a searcher to identify the document from which the key was extracted.
Dependent claims: 44-54

55. A system for extracting referential keys from a document, the system comprising:
a parser configured to parse a .pdf document to identify at least one key, the key being identified from at least one indicator, the indicator including at least one of a placement indicator, a format indicator, and a font indicator;
an identifier configured to identify the key according to a key type, the key type being identified from the at least one indicator;
a classifier configured to extract the key and store the key in a location in an extensible markup language (XML) document, the location corresponding to the key type such that the key can be found by a search seeking one of the key and the key-type allowing a searcher to identify the document from which the key was extracted.
Dependent claims: 56-63

64. A method of information searching in a document, comprising:
creating a reference key corresponding to at least one contextual indicator present in the document;
parsing successive portions of the document;
identifying at least one portion of the document that includes the reference key; and
reviewing the information included in the at least one portion.
Dependent claims: 65-69

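The search steps recited in claim 64 can be illustrated with a short sketch. It rests on several assumptions that the claim does not state: the contextual indicator is a plain word, the reference key is a case-insensitive whole-word pattern built from it, and "successive portions" are paragraphs separated by blank lines.

```python
import re


def make_reference_key(indicator):
    """Create a reference key (here, a compiled pattern) from an indicator."""
    return re.compile(rf"\b{re.escape(indicator)}\b", re.IGNORECASE)


def portions_with_key(text, reference_key):
    """Walk the document portion by portion and return (index, portion)
    pairs for every portion that contains the reference key."""
    portions = [p.strip() for p in text.split("\n\n") if p.strip()]
    return [(i, p) for i, p in enumerate(portions) if reference_key.search(p)]
```

The returned portions are the ones a reader or a downstream program would then review; the index records where each portion sits in the document.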
Specification