Method for aligning a text image to a transcription of the image

  • US 5,689,585 A
  • Filed: 04/28/1995
  • Issued: 11/18/1997
  • Est. Priority Date: 04/28/1995
  • Status: Expired due to Term
First Claim

1. A method of operating a machine to align image words in a text image to transcription words in a transcription associated with the text image;

  • the machine including a processor and a memory device for storing data;

    the data stored in the memory device including instruction data the processor executes to operate the machine;

    the processor being connected to the memory device for accessing the data stored therein;

    the method comprising:

    operating the processor to obtain an image definition data structure defining a text image including a plurality of glyphs representing characters in an image character set;

    operating the processor to obtain a transcription data structure associated with the text image and including a plurality of transcription labels indicating character codes representing characters in the image character set;

    operating the processor to produce an ordered image word sequence of image words occurring in the text image;

    each image word being an image region in the text image including at least one glyph;

    operating the processor to produce an ordered transcription word sequence of transcription words occurring in the transcription data structure;

    each transcription word being a sequence of at least one transcription label indicating a character code in the image character set; and

    operating the processor to perform an alignment operation to align image words in the ordered image word sequence with transcription words in the ordered transcription word sequence subject to a constraint of maintaining word order in each of the ordered transcription word sequence and the ordered image word sequence during alignment;

    the alignment operation producing an image-transcription alignment data structure indicating each image word in the ordered image word sequence paired with either no (a null) transcription word or with at most one transcription word in the ordered transcription word sequence.
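The alignment step recited above can be illustrated with a short dynamic-programming sketch. This is only one possible reading under stated assumptions, not the patented implementation: the claim does not specify a cost function, so the example assumes each image word is summarized by its pixel width and scores a pairing by how far that width is from an assumed average character width times the transcription word's length. The names `align_words`, `char_width`, and `skip_cost` are hypothetical. What the sketch does preserve from the claim is the order constraint and the output shape: every image word is paired with either one transcription word or with none (`None`).

```python
from typing import List, Optional, Tuple

def align_words(
    image_widths: List[float],
    transcription_words: List[str],
    char_width: float = 10.0,   # assumed average glyph width in pixels
    skip_cost: float = 30.0,    # assumed penalty for leaving a word unpaired
) -> List[Tuple[int, Optional[int]]]:
    """Order-preserving alignment of image words to transcription words.

    Returns one (image_index, transcription_index or None) pair per image word.
    """
    n, m = len(image_widths), len(transcription_words)
    inf = float("inf")
    # cost[i][j]: best cost of aligning the first i image words
    # with the first j transcription words.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    def relax(i: int, j: int, value: float, move: str) -> None:
        if value < cost[i][j]:
            cost[i][j] = value
            back[i][j] = move

    for i in range(n + 1):
        for j in range(m + 1):
            c = cost[i][j]
            if c == inf:
                continue
            if i < n and j < m:
                # Pair image word i with transcription word j.
                expected = char_width * len(transcription_words[j])
                relax(i + 1, j + 1, c + abs(image_widths[i] - expected), "match")
            if i < n:
                # Pair image word i with the null transcription word.
                relax(i + 1, j, c + skip_cost, "null")
            if j < m:
                # Leave transcription word j without an image word.
                relax(i, j + 1, c + skip_cost, "skip")

    # Trace back from the end to recover the pairing in order.
    pairs: List[Tuple[int, Optional[int]]] = []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "match":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move == "null":
            pairs.append((i - 1, None))
            i -= 1
        else:  # "skip"
            j -= 1
    pairs.reverse()
    return pairs

# Three image words, but the transcription is missing the middle word.
result = align_words([30.0, 50.0, 20.0], ["the", "an"])
```

In this example the middle image word has no plausible counterpart, so it is paired with the null transcription word while the other two keep their relative order. A real system would likely replace the width-based score with something derived from the glyphs themselves.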
