
Merging three optical character recognition outputs for improved precision using a minimum edit distance function

  • US 5,459,739 A
  • Filed: 03/18/1992
  • Issued: 10/17/1995
  • Est. Priority Date: 03/18/1992
  • Status: Expired due to Fees
First Claim

1. A method for determining the content of an ancestral printed text from outputs of text to data conversion devices, each of said pages of printed text being made up of lines of symbols, and each of said outputs representing text symbols within strings-of-lines, said method comprising the steps of:

  • generating three separate ones of said outputs, A, B, and C, each of said three outputs being derived from a common one of said pages of said printed text, and each of said outputs being generated by an optical character recognition device;

    merging said outputs A, B, and C by,
      determining edit distances of the strings of lines, each line considered as a single unit, between the outputs A and B, A and C, and B and C,
      recovering an optimal alignment between outputs A, B, and C by searching for the minimum edit distance D(A,B,C) by backtracking through the determined edit distances, and
      providing a derived output representing said content of said common one of said pages of printed text, the string-of-lines said derived output representing said optimal alignment; and

    recording said derived output on a memory medium.
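
The pairwise part of the merging step can be illustrated with a minimal sketch. It is not the patented implementation: it assumes unit costs for inserting, deleting, or substituting a line, treats two lines as matching only when they are identical, and covers only the pairwise distances and backtracking, not the three-way minimum D(A,B,C). The function names `line_edit_distance` and `backtrack`, and the sample outputs, are hypothetical.

```python
def line_edit_distance(a, b):
    """Edit distance between two lists of lines, each line treated as one unit.

    Assumes unit cost for insert, delete, and substitute (an illustrative choice).
    Returns the distance and the full dynamic-programming table.
    """
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i  # delete all of a[:i]
    for j in range(1, n + 1):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # line only in a
                          d[i][j - 1] + 1,        # line only in b
                          d[i - 1][j - 1] + sub)  # lines paired up
    return d[m][n], d


def backtrack(a, b, d):
    """Recover one optimal alignment by walking the table backwards.

    Returns a list of (line_from_a, line_from_b) pairs; None marks a gap.
    """
    i, j = len(a), len(b)
    pairs = []
    while i > 0 or j > 0:
        sub = 0 if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else 1
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + sub:
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


# Hypothetical outputs of three OCR devices for the same page.
A = ["The quick", "brown fox", "jumps"]
B = ["The quick", "brown f0x", "jumps"]
C = ["The quick", "jumps"]

for name, (x, y) in {"A-B": (A, B), "A-C": (A, C), "B-C": (B, C)}.items():
    dist, table = line_edit_distance(x, y)
    print(name, dist, backtrack(x, y, table))
```

In this sketch a dropped line (as in C) shows up as a gap in the alignment, while a misread line (as in B) shows up as a mismatched pair.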
