Word spotting in bitmap images using context-sensitive character models without baselines

  • US 5,592,568 A
  • Filed: 02/13/1995
  • Issued: 01/07/1997
  • Est. Priority Date: 12/17/1992
  • Status: Expired due to Term
First Claim

1. For use in a processor-based method of determining whether a keyword is present in a bitmap input image wherein the input image is subjected to an operation that determines bounding boxes for portions of the input image that potentially contain text and an operation that generates a sequence of features for each such portion of the input image, the text being considered to extend horizontally, a method for modeling the appearance of keywords in the input image, the method comprising the steps of:

  • providing a set of previously-trained single-character hidden Markov models (HMMs);

    concatenating those single-character HMMs that correspond to the characters in the keyword so as to define a keyword HMM; and

    constructing an HMM network that includes the keyword HMM;

    said providing step being characterized in that:

    each character has a shape that is characterized at each of a plurality of horizontal locations along the character by at least one parameter representing a vertical slice of the character at the horizontal location,

    each character has a number of distinct portions, the number being greater than 1 for at least some characters,

    each distinct portion spans a respective contiguous subset of the plurality of horizontal locations,

    a given single-character HMM for a given character is characterized by a number of states, each state of which corresponds to a respective one of the number of distinct portions of the given character,

    each state is characterized by a statistical distribution of the at least one parameter for the corresponding distinct portion of the given character,

    each single-character HMM has a number of possible contexts depending on whether the character has an ascender or descender,

    for a given single-character HMM, a given state for a first context differs from the given state for a second context in a manner that reflects differences in at least one of (a) the size of the given character, and (b) the vertical position of the given character, such differences being between a situation when the given character is registered to a bounding box of a portion of the input image in the first context and a situation when the given character is registered to the bounding box of a portion of the input image in the second context, and

    the single-character HMMs that are concatenated to form the keyword HMM have the same context.
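
The concatenation step recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything named here is an assumption made for illustration: the four context labels, the ASCENDERS and DESCENDERS letter sets, the State record, the two-states-per-character toy models, and the rule that the context is chosen from the letters of the keyword as a whole. The sketch shows only that every single-character model joined into the keyword model is taken from the same context; state distributions and transition probabilities are omitted.

```python
from dataclasses import dataclass

# Illustrative letter sets (assumptions, not taken from the claim).
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def word_context(word):
    """Pick one context label for a whole word from its letters."""
    has_asc = any(c in ASCENDERS for c in word)
    has_desc = any(c in DESCENDERS for c in word)
    if has_asc and has_desc:
        return "both"
    if has_asc:
        return "ascender"
    if has_desc:
        return "descender"
    return "neither"


@dataclass(frozen=True)
class State:
    """One HMM state: a distinct portion of a character in one context."""
    char: str
    context: str
    portion: int


def make_toy_models(chars, states_per_char=2):
    """Build placeholder single-character models for every context.

    A real model would attach a trained distribution of slice features
    to each state; here a state is only a label.
    """
    contexts = ("both", "ascender", "descender", "neither")
    return {
        (ch, ctx): [State(ch, ctx, i) for i in range(states_per_char)]
        for ch in chars
        for ctx in contexts
    }


def keyword_hmm(keyword, char_models):
    """Concatenate single-character models that share the keyword's context."""
    ctx = word_context(keyword)
    states = []
    for ch in keyword:
        states.extend(char_models[(ch, ctx)])
    return states


models = make_toy_models("dognsu")
dog = keyword_hmm("dog", models)
print(word_context("dog"), len(dog))
```

In this sketch the letter "o" contributes different states to "dog" than it would to a word with no ascenders or descenders, which mirrors the claim's point that a character's size and vertical position relative to the bounding box depend on the context.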
