Document matching engine using asymmetric signature generation

  • US 7,860,853 B2
  • Filed: 02/11/2008
  • Issued: 12/28/2010
  • Est. Priority Date: 02/14/2007
  • Status: Active Grant
First Claim

1. An automated method of document matching using asymmetrical signature generation, the method comprising:

    receiving documents from a document repository;

    generating signatures for each of the documents using a first signature generator;

    providing the signatures and a document identifier for each of the documents to a signature database;

    receiving an input document;

    generating signatures for the input document using a second signature generator; and

    searching the signature database using the signatures generated for the input document,

    wherein the first and second signature generators are configured such that different numbers of signatures are generated for a same document, and

    wherein the first and second signature generators each:

    receive a document comprising a plurality of characters;

    normalize the document to remove non-informative characters from the plurality of characters;

    calculate a score for each informative character of the plurality of characters based on an occurrence frequency and distribution in the document;

    rank each informative character of the plurality of characters based on the calculated score;

    select, from the ranked informative characters, character occurrences; and

    generate a signature for each selected character occurrence.
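
The steps recited in the claim can be illustrated with a minimal Python sketch. Nothing below is taken from the patent specification: the scoring formula (occurrence count multiplied by positional spread), the choice of the first few occurrences of each top-ranked character, the fixed-length text window hashed with SHA-1, and every name (`normalize`, `score_characters`, `generate_signatures`, `SignatureDatabase`, `top_chars`, `per_char`, `window`) are assumptions made for illustration only. The asymmetry is modeled by calling the same routine with different `per_char` limits for the repository side and the input side.

```python
import hashlib
import re
from collections import defaultdict


def normalize(text):
    """Lowercase and keep only letters and digits (assumed to be the 'informative' characters)."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def score_characters(doc):
    """Score each character by count times positional spread (a hypothetical formula)."""
    positions = defaultdict(list)
    for i, ch in enumerate(doc):
        positions[ch].append(i)
    scores = {}
    for ch, pos in positions.items():
        spread = (pos[-1] - pos[0] + 1) / len(doc)  # fraction of the document the character spans
        scores[ch] = len(pos) * spread
    return scores, positions


def generate_signatures(text, top_chars, per_char, window=8):
    """Normalize, score, rank, select occurrences, and hash a window at each selected occurrence."""
    doc = normalize(text)
    if not doc:
        return set()
    scores, positions = score_characters(doc)
    # Rank by descending score; ties are broken alphabetically so the ranking is deterministic.
    ranked = sorted(scores, key=lambda ch: (-scores[ch], ch))
    signatures = set()
    for ch in ranked[:top_chars]:
        for p in positions[ch][:per_char]:  # assumed selection rule: earliest occurrences first
            chunk = doc[p:p + window]
            if len(chunk) == window:  # skip occurrences too close to the end of the document
                digest = hashlib.sha1(chunk.encode("utf-8")).hexdigest()
                signatures.add(digest[:16])
    return signatures


class SignatureDatabase:
    """In-memory stand-in for the signature database: signature -> document identifiers."""

    def __init__(self):
        self._index = defaultdict(set)

    def add(self, doc_id, signatures):
        for sig in signatures:
            self._index[sig].add(doc_id)

    def search(self, signatures):
        """Return {doc_id: number of matching signatures} for the given query signatures."""
        hits = defaultdict(int)
        for sig in signatures:
            for doc_id in self._index.get(sig, ()):
                hits[doc_id] += 1
        return dict(hits)


# Usage: the "first" generator indexes with few signatures, the "second" queries with more.
text = "The quick brown fox jumps over the lazy dog near the river bank."
db = SignatureDatabase()
index_sigs = generate_signatures(text, top_chars=3, per_char=2)
db.add("doc-1", index_sigs)
query_sigs = generate_signatures(text, top_chars=3, per_char=10)
matches = db.search(query_sigs)
```

In this sketch the input-side generator produces a superset of the repository-side signatures for the same document, so every stored signature is found at query time. Which side generates more signatures, and how occurrences are actually scored and selected, is a design choice the claim leaves open.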
