Methods and apparatus for extracting referential keys from a document
First Claim
1. A computer-readable medium storing a computer program, wherein the stored computer program contains instructions for extracting a referential key from a document image, the extracted referential key comprising a key type, a key location, and characters representing the extracted referential key, wherein the document image is derived from a scanner, the stored computer program comprising at least one code segment that:
- parses the document image to locate a first indicator, the located first indicator including at least one of a placement indicator, a format indicator, and a font indicator that indicates a found referential key within the document image;
- determines if the located first indicator is determinative of a key type of the found referential key, without knowledge of text contained within the found referential key;
- if the located first indicator is not determinative of the key type of the found referential key;
- parses the found referential key to locate a second indicator, the located second indicator including at least one of a placement indicator, a format indicator, and a font indicator;
- determines if a combination of the located first indicator and the located second indicator is determinative of the key type of the found referential key;
- extracts characters from the found referential key using an optical character recognition routine to scan the found referential key once at least one of the located first indicator, and the combination of the located first indicator and the located second indicator, is found to be determinative of the key type; and
- stores the key type of the found referential key, the key location of the found referential key, and the extracted characters of the found referential key representing the found referential key in a structured format according to the key type, the extracted characters, and the key location, wherein the structured format comprises computer-readable content allowing navigation to, from, and within the document image.
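The control flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The indicator strings, the two rule tables, the `Region` and `ReferentialKey` types, and the `find_second` and `ocr` callables are all hypothetical names introduced here for illustration; the claim itself does not enumerate concrete rules or data structures. The sketch shows only the ordering: the key type is decided from layout indicators alone, a second indicator is sought only when the first is not determinative, and OCR runs only after the key type is settled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Optional, Tuple

Indicator = str  # hypothetical encoding, e.g. "placement:top-right", "font:bold"

# Hypothetical rule tables; the claim does not list any concrete rules.
SINGLE_RULES: Dict[Indicator, str] = {
    "placement:top-right": "invoice_number",
}
COMBINED_RULES: Dict[FrozenSet[Indicator], str] = {
    frozenset({"font:bold", "format:underlined"}): "section_heading",
}


@dataclass
class Region:
    """A found referential key, known so far only by layout, not by its text."""
    location: Tuple[int, int, int, int, int]  # assumed: page, x, y, width, height
    first_indicator: Indicator


@dataclass
class ReferentialKey:
    key_type: str
    location: Tuple[int, int, int, int, int]
    characters: str


def extract_key(
    region: Region,
    find_second: Callable[[Region], Optional[Indicator]],
    ocr: Callable[[Region], str],
) -> Optional[ReferentialKey]:
    # Step 1: is the first indicator determinative on its own?
    key_type = SINGLE_RULES.get(region.first_indicator)

    # Step 2: only if it is not, parse the region for a second indicator
    # and test the combination of both indicators.
    if key_type is None:
        second = find_second(region)
        if second is not None:
            key_type = COMBINED_RULES.get(
                frozenset({region.first_indicator, second})
            )

    # No determinative indicator: the sketch gives up without running OCR.
    if key_type is None:
        return None

    # Step 3: OCR is applied only once the key type has been determined.
    return ReferentialKey(key_type, region.location, ocr(region))
```

In this sketch the OCR callable is never invoked for a region whose key type cannot be determined, which mirrors the claim's requirement that the key type be decided without knowledge of the text inside the key.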
Abstract
Methods, computer-readable media, and systems for extracting referential keys from a document are provided. A document is parsed to identify at least one key, the key being identified from at least one contextual indication. The key is classified according to a key type, the key type being identified from the contextual indication. The key is extracted and then stored in a location in a structured shell with the location corresponding to the key type. As a result, the key can be found by a search seeking one of the key and the key type allowing a searcher to identify the document from which the key was extracted.
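As a rough illustration of the storage and search behaviour the abstract describes, the following Python sketch keeps one slot per key type and lets a search by either the key or the key type lead back to the source document. The `StructuredShell` class, its method names, and the document identifiers are assumptions made for this example; the abstract does not specify how the structured shell is laid out.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple


class StructuredShell:
    """Hypothetical shell: one storage slot per key type."""

    def __init__(self) -> None:
        # key_type -> list of (extracted characters, document id)
        self._slots: DefaultDict[str, List[Tuple[str, str]]] = defaultdict(list)

    def store(self, key_type: str, characters: str, doc_id: str) -> None:
        # The slot a key lands in corresponds to its key type.
        self._slots[key_type].append((characters, doc_id))

    def find_by_type(self, key_type: str) -> List[str]:
        # Documents that contain at least one key of this type.
        return sorted({doc for _, doc in self._slots.get(key_type, [])})

    def find_by_key(self, characters: str) -> List[str]:
        # Documents from which this exact key was extracted, across all types.
        return sorted(
            {
                doc
                for entries in self._slots.values()
                for chars, doc in entries
                if chars == characters
            }
        )


shell = StructuredShell()
shell.store("invoice_number", "INV-001", "scan_0001")
shell.store("invoice_number", "INV-002", "scan_0002")
shell.store("section_heading", "Terms", "scan_0001")
```

Either lookup returns document identifiers, so a searcher who knows only the key type, or only the key itself, can still reach the document the key came from.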
10 Claims
1. Independent claim, reproduced in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
Specification