Apparatus, method and programmable product for identification of a document with feature analysis

US 8,520,888 B2
Filed: 04/25/2008
Issued: 08/27/2013
Est. Priority Date: 04/26/2007
Status: Active Grant

Abstract

The present application relates to a method, apparatus and programmable product for uniquely identifying a document. More specifically, the application allows for the identification of the document through collection of minutiae data at various points throughout the document's lifecycle without reliance upon or requirement for any unique identification characters, barcodes and/or objects that were added to the document specifically for the purpose of identification.

25 Claims

1. A method of compiling information for unique identification of one document from among a plurality of documents, the method comprising steps of:
- receiving a representation of the one document;
  
  extracting minutiae data from the representation of the document, in accordance with defined identification criteria, sufficient to uniquely identify a hardcopy of the document;
  
  collecting metadata regarding the representation of the document; and
  
  storing the extracted minutiae data in association with the collected metadata, in a searchable database of data regarding the plurality of documents, wherein:
  
  the extracted minutiae data comprise a plurality of features associated with text on the one document,
  
  the extracted minutiae data are not associated with human fingerprinting or a barcode and the extracted minutiae data were not added to the document specifically for the purpose of document identification,
  
  the minutiae data are selected from:
  
  word count per page or per the entire document, tab spacing, indentation lengths, margin lengths, paragraph numbers, header location, footer location, line numbers, line spacing, character spacing, font spacing, number of characters, textual color properties, text strings, text characters, white space total area data, specific text, specific phrases and specific numbers.
- - 2. The method of claim 1, wherein:
    - the received representation of the one document is a pre-print electronic representation of the document; and
      
      the step of extracting minutiae data, comprising a plurality of features associated with text on the one pre-print electronic representation of the document, comprises obtaining electronic minutiae data from the pre-print electronic representation of the document.
  - 3. The method of claim 1, wherein:
    - the received representation of the one document comprises an electronic image of a hardcopy of the document; and
      
      the step of extracting minutiae data comprises obtaining hardcopy minutiae data from the hardcopy of the document.
  - 4. The method of claim 3, further comprising:
    - obtaining physical characteristic minutiae data of the image of the hardcopy of the document; and
      
      including the physical characteristic minutiae data in the extracted minutiae data for the document stored in the database.
  - 5. The method of claim 4, wherein the physical characteristics minutiae data comprise properties relating to one or more aspects of the one document selected from the group consisting of chemical, radio frequency, magnetic or microscopic properties of the document.
  - 6. The method of claim 1, wherein the minutiae data is selected from the group consisting of:
    - positioning of text on a page of the one document; and
      
      defined text contained in the one document.
  - 7. A non-transitory computer readable medium embodying a program, wherein execution of the program causes a computer to implement the method of claim 1.
  - 8. A system configured to implement the steps of the method of claim 1.
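
The steps recited in claim 1 (receive a representation, extract text-derived minutiae, collect metadata, store both in a searchable database) can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the patented implementation: the representation is assumed to be a list of plain-text page strings, and the names `extract_minutiae`, `store_record`, and the `documents` table are hypothetical. Only a few of the listed features (word counts, character count, line count, indentation lengths) are computed.

```python
import json
import sqlite3


def extract_minutiae(pages):
    """Derive a few text-layout features from a list of page strings.

    Assumption: each page is plain text; a real system would work from
    a print stream or a scanned image instead.
    """
    lines = [line for page in pages for line in page.splitlines()]
    nonblank = [line for line in lines if line.strip()]
    return {
        "word_count_per_page": [len(page.split()) for page in pages],
        "word_count_total": sum(len(page.split()) for page in pages),
        "character_count": sum(len(page) for page in pages),
        "line_count": len(lines),
        # Leading spaces per non-blank line, as a stand-in for indentation.
        "indentation_lengths": [len(l) - len(l.lstrip(" ")) for l in nonblank],
    }


def store_record(conn, doc_id, minutiae, metadata):
    """Store minutiae alongside metadata in a hypothetical SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "doc_id TEXT PRIMARY KEY, minutiae TEXT, metadata TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO documents VALUES (?, ?, ?)",
        (doc_id, json.dumps(minutiae, sort_keys=True),
         json.dumps(metadata, sort_keys=True)),
    )
    conn.commit()
```

Serializing the minutiae as JSON keeps the sketch short; a production database would more likely index individual features so they can be searched directly.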

9. A method of compiling information for recognition of a hardcopy of a document, the method comprising steps of:
- collecting minutiae data of the hardcopy of the document, in accordance with defined identification criteria, sufficient to uniquely identify the hardcopy of the document, wherein the collected minutiae data was not added to the document specifically for the purpose of document identification;
  
  comparing the collected minutiae data of the hardcopy of the document to minutiae data for a plurality of identified documents in a database; and
  
  returning a result indicating whether or not the collected minutiae data matched minutiae data of any of the documents identified in the database,
  
  wherein the collected minutiae data comprises a plurality of features associated with text on the hardcopy document, and
  
  the collected minutiae data is not associated with human fingerprinting or a barcode.
- - 10. The method of claim 9, wherein:
    - the collecting step comprises extracting the minutiae data of the hardcopy of the document from an image taken of the hardcopy of the document; and
      
      the comparing step comprises comparing the minutiae data extracted from the image to corresponding minutiae data of the documents identified in the database.
  - 11. The method of claim 10, wherein:
    - the collecting step further comprises obtaining physical minutiae data regarding the hardcopy of the document; and
      
      the comparing step further comprises comparing the physical minutiae data regarding the hardcopy of the document to physical minutiae data regarding the documents identified in the database.
  - 12. The method of claim 9, wherein:
    - the collecting step comprises obtaining physical minutiae data regarding the hardcopy of the document; and
      
      the comparing step comprises comparing the physical minutiae data regarding the hardcopy of the document to physical minutiae data regarding the documents identified in the database.
  - 13. The method of claim 9, wherein upon the result indicating a match of the collected minutiae data to minutiae data of one of the documents identified in the database, the method further comprises:
    - displaying metadata associated with the image of the hardcopy of the document or metadata associated with the one document identified in the database.
  - 14. The method of claim 9, wherein upon the result indicating a match of the collected minutiae data to minutiae data of one of the documents identified in the database, the method further comprises:
    - comparing the metadata associated with the image of the hardcopy of the document to metadata associated with the one document identified in the database, and when the metadata differ, updating the metadata associated with the one document identified in the database in accordance with the metadata associated with the image of the hardcopy of the document.
  - 15. The method of claim 9, further comprising controlling at least one operation of processing of the hardcopy of the document responsive to the result.
  - 16. The method of claim 9, wherein the collected minutiae data comprise properties relating to one or more aspects of the hardcopy of the document selected from the group consisting of chemical, radio frequency, magnetic or microscopic properties of the hardcopy of the document.
  - 17. A non-transitory computer readable medium embodying a program, wherein execution of the program causes a computer to implement the method of claim 9.
  - 18. A system configured to implement the steps of the method of claim 9.
  - 19. The method of claim 9, wherein the minutiae data are selected from:
    - word count per page or per the entire document, tab spacing, indentation lengths, margin lengths, paragraph numbers, header location, footer location, line numbers, line spacing, character spacing, font spacing, number of characters, textual color properties, text strings, text characters, white space total area data, specific text, specific phrases and specific numbers.
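
The compare-and-report steps of claim 9, together with the metadata update of claim 14, can be sketched as follows. The claims do not specify a matching rule, so the fraction-of-agreeing-features score and the `threshold` parameter below are assumptions made purely for illustration; the function names and the in-memory dictionary standing in for the database are likewise hypothetical.

```python
def similarity(a, b):
    """Fraction of feature names on which two minutiae dicts agree exactly."""
    keys = set(a) | set(b)
    if not keys:
        return 0.0
    agree = sum(1 for k in keys if k in a and k in b and a[k] == b[k])
    return agree / len(keys)


def find_match(collected, database, threshold=0.9):
    """Compare collected minutiae to every record and report the result.

    `database` maps doc_id -> {"minutiae": dict, "metadata": dict}.
    The threshold is an assumed tuning parameter, not from the source.
    """
    best_id, best_score = None, 0.0
    for doc_id, record in database.items():
        score = similarity(collected, record["minutiae"])
        if score > best_score:
            best_id, best_score = doc_id, score
    if best_id is not None and best_score >= threshold:
        return {"matched": True, "doc_id": best_id, "score": best_score}
    return {"matched": False, "doc_id": None, "score": best_score}


def reconcile_metadata(database, doc_id, image_metadata):
    """Update stored metadata from the image's metadata when they differ.

    Returns True if an update was made.
    """
    stored = database[doc_id]["metadata"]
    if stored != image_metadata:
        stored.update(image_metadata)
        return True
    return False
```

Exact equality per feature is the simplest possible comparison; features measured from a scanned hardcopy would in practice need tolerances.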

20. A method of compiling information for authenticating a hardcopy of a document, the method comprising steps of:
- collecting both physical minutiae data regarding the hardcopy of the document and image minutiae data extracted from an image of the hardcopy of the document, in accordance with defined identification criteria, sufficient to uniquely identify the hardcopy of the document, wherein the physical minutiae data and image minutiae data were not added to the document specifically for the purpose of document identification;
  
  comparing the collected image and physical minutiae data of the hardcopy of the document to corresponding minutiae data for a plurality of identified documents in a database; and
  
  returning an authentication result indicating whether or not the collected minutiae data matched minutiae data of any of the documents identified in the database,
  
  wherein the collected image minutiae data comprises a plurality of features associated with text on the image of the hardcopy document,
  
  the collected image data is not associated with human fingerprinting or a barcode.
- - 21. The method of claim 20, wherein the image minutiae data is selected from the group consisting of:
    - font of text used on the document;
      
      positioning of text on a page of the document; and
      
      defined text contained in the document.
  - 22. The method of claim 20, wherein the physical minutiae data includes properties relating to one or more aspects of the hard copy of the document selected from the group consisting of chemical, radio frequency, magnetic or microscopic properties.
  - 23. The method of claim 20, wherein upon the authentication result indicating a match of the collected minutiae data to minutiae data of one of the documents identified in the database, the method further comprises:
    - displaying metadata associated with the image of the hardcopy of the document or metadata associated with the document identified in the database.
  - 24. The method of claim 20, further comprising controlling at least one operation of processing of the hardcopy of the document responsive to the authentication result.
  - 25. The method of claim 20, wherein the image minutiae data are selected from:
    - word count per page or per the entire document, tab spacing, indentation lengths, margin lengths, paragraph numbers, header location, footer location, line numbers, line spacing, character spacing, font spacing, number of characters, textual color properties, text strings, text characters, white space total area data, specific text, specific phrases and specific numbers.
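
Claim 20 differs from claim 9 in that both image minutiae and physical minutiae are collected and compared. A minimal sketch of such a two-part check follows. It assumes, purely for illustration, that physical minutiae are numeric sensor readings compared within a relative tolerance and that image minutiae are compared exactly; the names `physical_match` and `authenticate`, the `rel_tol` parameter, and the record layout are all hypothetical.

```python
import math


def physical_match(collected, stored, rel_tol=0.05):
    """True when both readings cover the same properties within tolerance.

    Assumption: each property (e.g. a magnetic reading) is a single float.
    """
    return collected.keys() == stored.keys() and all(
        math.isclose(collected[k], stored[k], rel_tol=rel_tol)
        for k in collected
    )


def authenticate(image_minutiae, physical_minutiae, database, rel_tol=0.05):
    """Return an authentication result requiring both kinds of data to match.

    `database` maps doc_id -> {"image": dict, "physical": dict}.
    """
    for doc_id, record in database.items():
        if record["image"] == image_minutiae and physical_match(
            physical_minutiae, record["physical"], rel_tol
        ):
            return {"authenticated": True, "doc_id": doc_id}
    return {"authenticated": False, "doc_id": None}
```

Requiring both comparisons to succeed means a faithful photocopy, which reproduces the text layout but not the physical properties of the original sheet, would fail this check.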

Current Assignee: Bell & Howell LLC (Böwe Systec AG)
Original Assignee: Bell & Howell LLC (Böwe Systec AG)
Inventors: Spitzig, Roger; Conard, Walter S.; Phifer, Leondo R.
Primary Examiner(s): GORADIA, SHEFALI DINESH
Application Number: US12/149,024
Publication Number: US 20100027834A1
Time in Patent Office: 1,950 Days
Field of Search: 382/100, 382/218, 382/115, 382/116
US Class Current: 382/100
CPC Class Codes:
G06V 20/80   Recognising image objects c...
G06V 30/10   Character recognition
G06V 30/186   by deriving mathematical or...
G07B 17/00467   Transporting mailpieces pos...
G07B 2017/00443   Verification of mailpieces,...
G07B 2017/00491   Mail/envelope/insert handli...
