Apparatus, method and programmable product for identification of a document with feature analysis
First Claim
1. A method of compiling information for unique identification of one document from among a plurality of documents, the method comprising steps of:
receiving a representation of the one document;
extracting minutiae data from the representation of the document, in accordance with defined identification criteria, sufficient to uniquely identify a hardcopy of the document;
collecting metadata regarding the representation of the document; and
storing the extracted minutiae data in association with the collected metadata, in a searchable database of data regarding the plurality of documents, wherein:
the extracted minutiae data comprise a plurality of features associated with text on the one document,
the extracted minutiae data are not associated with human fingerprinting or a barcode and the extracted minutiae data were not added to the document specifically for the purpose of document identification,
the minutiae data are selected from:
word count per page or per the entire document, tab spacing, indentation lengths, margin lengths, paragraph numbers, header location, footer location, line numbers, line spacing, character spacing, font spacing, number of characters, textual color properties, text strings, text characters, white space total area data, specific text, specific phrases and specific numbers.
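The extract-and-store steps of claim 1 can be illustrated with a minimal Python sketch. It assumes the received representation is already plain text (for example, OCR output), computes only a small subset of the listed features, and uses an in-memory SQLite table as the searchable database. The function names `extract_minutiae`, `store_document` and `find_by_counts`, the table layout, and the choice of leading spaces as a stand-in for indentation length are all hypothetical and are not taken from the patent.

```python
import json
import re
import sqlite3


def extract_minutiae(text: str) -> dict:
    """Compute an illustrative subset of the text features listed in the claim."""
    lines = text.splitlines()
    return {
        "word_count": len(text.split()),
        # Characters per line, excluding the line terminators.
        "char_count": sum(len(line) for line in lines),
        "line_count": len(lines),
        # Assumption: leading spaces stand in for indentation length.
        "indentation_lengths": [
            len(line) - len(line.lstrip(" ")) for line in lines if line.strip()
        ],
        "specific_numbers": re.findall(r"\d+", text),
    }


def store_document(conn: sqlite3.Connection, minutiae: dict, metadata: dict) -> int:
    """Store minutiae together with metadata; two counts are kept as searchable columns."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "id INTEGER PRIMARY KEY, word_count INTEGER, char_count INTEGER, "
        "minutiae TEXT, metadata TEXT)"
    )
    cur = conn.execute(
        "INSERT INTO documents (word_count, char_count, minutiae, metadata) "
        "VALUES (?, ?, ?, ?)",
        (
            minutiae["word_count"],
            minutiae["char_count"],
            json.dumps(minutiae),
            json.dumps(metadata),
        ),
    )
    conn.commit()
    return cur.lastrowid


def find_by_counts(conn: sqlite3.Connection, word_count: int, char_count: int) -> list:
    """Return (id, metadata) pairs for stored documents with the given counts."""
    rows = conn.execute(
        "SELECT id, metadata FROM documents WHERE word_count = ? AND char_count = ?",
        (word_count, char_count),
    ).fetchall()
    return [(row[0], json.loads(row[1])) for row in rows]
```

In this sketch the full feature set is serialised as JSON next to the metadata, while two numeric features are duplicated into indexed-style columns so that a lookup does not need to parse every stored record. A real system would need many more features to approach unique identification.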
Abstract
The present application relates to a method, apparatus and programmable product for uniquely identifying a document. More specifically, the application allows for the identification of the document through collection of minutiae data at various points throughout the document's lifecycle without reliance upon or requirement for any unique identification characters, barcodes and/or objects that were added to the document specifically for the purpose of identification.
25 Claims
1. (Reproduced in full under First Claim above.) Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method of compiling information for recognition of a hardcopy of a document, the method comprising steps of:
collecting minutiae data of the hardcopy of the document, in accordance with defined identification criteria, sufficient to uniquely identify the hardcopy of the document, wherein the collected minutiae data was not added to the document specifically for the purpose of document identification;
comparing the collected minutiae data of the hardcopy of the document to minutiae data for a plurality of identified documents in a database; and
returning a result indicating whether or not the collected minutiae data matched minutiae data of any of the documents identified in the database, wherein the collected minutiae data comprises a plurality of features associated with text on the hardcopy document, and the collected minutiae data is not associated with human fingerprinting or a barcode.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
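The compare-and-return steps of claim 9 might be sketched as follows. This is an illustration under stated assumptions, not the patented matching procedure: the database is modelled as a plain dictionary of feature dictionaries, the score is simply the fraction of features whose values agree, and the `threshold` parameter is an added assumption (the claim only requires a result indicating whether a match was found). The names `match_score` and `find_match` and the feature keys in the usage note are hypothetical.

```python
def match_score(collected: dict, stored: dict) -> float:
    """Fraction of features (over the union of keys) whose values agree exactly."""
    keys = set(collected) | set(stored)
    if not keys:
        return 0.0
    agree = sum(
        1
        for k in keys
        if k in collected and k in stored and collected[k] == stored[k]
    )
    return agree / len(keys)


def find_match(collected: dict, database: dict, threshold: float = 1.0) -> dict:
    """Compare collected minutiae with every stored record and report the outcome."""
    best_id, best = None, 0.0
    for doc_id, stored in database.items():
        score = match_score(collected, stored)
        if score > best:
            best_id, best = doc_id, score
    matched = best_id is not None and best >= threshold
    return {
        "matched": matched,
        "document_id": best_id if matched else None,
        "score": best,
    }
```

With the default threshold of 1.0 every feature must agree. For example, against a database holding `{"doc-a": {"word_count": 6, "line_count": 2, "margin_mm": 25}}`, a scan that reports `line_count` as 3 is rejected, but the same scan is accepted if the caller lowers `threshold` to 0.6.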
20. A method of compiling information for authenticating a hardcopy of a document, the method comprising steps of:
collecting both physical minutiae data regarding the hardcopy of the document and image minutiae data extracted from an image of the hardcopy of the document, in accordance with defined identification criteria, sufficient to uniquely identify the hardcopy of the document, wherein the physical minutiae data and image minutiae data were not added to the document specifically for the purpose of document identification;
comparing the collected image and physical minutiae data of the hardcopy of the document to corresponding minutiae data for a plurality of identified documents in a database; and
returning an authentication result indicating whether or not the collected minutiae data matched minutiae data of any of the documents identified in the database, wherein the collected image minutiae data comprises a plurality of features associated with text on the image of the hardcopy document, the collected image data is not associated with human fingerprinting or a barcode.
Dependent claims: 21, 22, 23, 24, 25
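Claim 20 differs from claim 9 in requiring two groups of minutiae, physical and image-derived, to be compared against corresponding stored data. A minimal Python sketch of that two-part check follows. The claim text here does not enumerate physical minutiae, so the sheet dimensions used in the usage note are purely hypothetical, as are the `DocumentRecord` type, the function names, and the relative tolerance applied to physical measurements.

```python
import math
from dataclasses import dataclass


@dataclass
class DocumentRecord:
    image_minutiae: dict     # text features extracted from an image of the hardcopy
    physical_minutiae: dict  # numeric measurements of the hardcopy itself (hypothetical)


def physical_match(collected: dict, stored: dict, rel_tol: float = 0.02) -> bool:
    """Physical measurements match if keys agree and values are within a tolerance."""
    if collected.keys() != stored.keys():
        return False
    return all(
        math.isclose(collected[k], stored[k], rel_tol=rel_tol) for k in stored
    )


def authenticate(image: dict, physical: dict, database: dict,
                 rel_tol: float = 0.02) -> dict:
    """Report whether both feature groups match the same stored record."""
    for doc_id, record in database.items():
        if record.image_minutiae == image and physical_match(
            physical, record.physical_minutiae, rel_tol
        ):
            return {"authentic": True, "document_id": doc_id}
    return {"authentic": False, "document_id": None}
```

As a usage illustration, a record stored with `{"sheet_width_mm": 210.0, "sheet_height_mm": 297.0}` would authenticate a hardcopy measured at 210.4 by 296.8 mm when the image features also agree, but would reject one measured at 216.0 by 279.0 mm. Exact equality is used for image features and a tolerance for physical ones only because measurements of a physical sheet are unlikely to repeat exactly; that design choice is an assumption of this sketch.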
Specification