Word spotting in bitmap images using context-sensitive character models without baselines

US 5,592,568 A
Filed: 02/13/1995
Issued: 01/07/1997
Est. Priority Date: 12/17/1992
Status: Expired due to Term


Abstract

Font-independent spotting of user-defined keywords in a scanned image. Word identification is based on features of the entire word without the need for segmentation or OCR, and without the need to recognize non-keywords. Font-independent character models are created using hidden Markov models (HMMs), and arbitrary keyword models are built from the character HMM components. Word or text line bounding boxes are extracted from the image, a set of features based on the word shape (and preferably also the word internal structure) within each bounding box is extracted, this set of features is applied to a network that includes one or more keyword HMMs, and a determination is made. The identification of word bounding boxes for potential keywords includes the steps of reducing the image (say, by 2×) and subjecting the reduced image to vertical and horizontal morphological closing operations. The bounding boxes of connected components in the resulting image are then used to hypothesize word or text line bounding boxes, and the original bitmaps within the boxes are used to hypothesize words. In a particular embodiment, a range of structuring elements is used for the closing operations to accommodate the variation of inter- and intra-character spacing with font and font size.
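
The bounding-box step described in the abstract (reduce, close vertically and horizontally, take connected components) can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the function names, the 2×2 "any pixel set" reduction rule, the odd structuring-element lengths `h_gap` and `v_gap`, and the 8-connectivity are all assumptions made for illustration, and only a single structuring-element size is used rather than a range.

```python
from collections import deque

def reduce2(img):
    """Reduce a binary image 2x; an output pixel is 1 if any pixel of its 2x2 block is 1 (assumed rule)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[int(any(img[2 * y + dy][2 * x + dx] for dy in (0, 1) for dx in (0, 1)))
             for x in range(w)] for y in range(h)]

def dilate_h(img, k):
    """Horizontal dilation with a 1 x k element (k odd, origin at the centre)."""
    r = k // 2
    return [[int(any(row[max(0, x - r):x + r + 1])) for x in range(len(row))] for row in img]

def erode_h(img, k):
    """Horizontal erosion with a 1 x k element; pixels outside the image are ignored."""
    r = k // 2
    return [[int(all(row[max(0, x - r):x + r + 1])) for x in range(len(row))] for row in img]

def transpose(img):
    return [list(col) for col in zip(*img)]

def close_h(img, k):
    """Closing = dilation followed by erosion; fills horizontal gaps shorter than k."""
    return erode_h(dilate_h(img, k), k)

def close_v(img, k):
    """Vertical closing, done as a horizontal closing of the transposed image."""
    return transpose(close_h(transpose(img), k))

def component_boxes(img):
    """Bounding boxes (left, top, right, bottom) of 8-connected components."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if not img[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(y0, x0)])
            top, left, bottom, right = y0, x0, y0, x0
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return boxes

def word_boxes(img, h_gap=3, v_gap=3):
    """Hypothesize word boxes in original-image coordinates (hypothetical helper)."""
    closed = close_h(close_v(reduce2(img), v_gap), h_gap)
    return [(2 * l, 2 * t, 2 * r + 1, 2 * b + 1) for l, t, r, b in component_boxes(closed)]
```

With small gap values, characters within a word merge into one component while the wider inter-word gaps survive; larger values of `h_gap` would instead merge words into text lines.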

18 Claims

1. For use in a processor-based method of determining whether a keyword is present in a bitmap input image wherein the input image is subjected to an operation that determines bounding boxes for portions of the input image that potentially contain text and an operation that generates a sequence of features for each such portion of the input image, the text being considered to extend horizontally, a method for modeling the appearance of keywords in the input image, the method comprising the steps of:
- providing a set of previously-trained single-character hidden Markov models (HMMs);
  
  concatenating those single-character HMMs that correspond to the characters in the keyword so as to define a keyword HMM; and
  
  constructing an HMM network that includes the keyword HMM;
  
  said providing step being characterized in that:
  
  each character has a shape that is characterized at each of a plurality of horizontal locations along the character by at least one parameter representing a vertical slice of the character at the horizontal location,
  
  each character has a number of distinct portions, the number being greater than 1 for at least some characters,
  
  each distinct portion spans a respective contiguous subset of the plurality of horizontal locations,
  
  a given single-character HMM for a given character is characterized by a number of states, each state of which corresponds to a respective one of the number of distinct portions of the given character,
  
  each state is characterized by a statistical distribution of the at least one parameter for the corresponding distinct portion of the given character,
  
  each single-character HMM has a number of possible contexts depending on whether the character has an ascender or descender,
  
  for a given single-character HMM, a given state for a first context differs from the given state for a second context in a manner that reflects differences in at least one of (a) the size of the given character, and (b) the vertical position of the given character, such differences being between a situation when the given character is registered to a bounding box of a portion of the input image in the first context and a situation when the given character is registered to the bounding box of a portion of the input image in the second context, and
  
  the single-character HMMs that are concatenated to form the keyword HMM have the same context.
- Dependent claims 2 to 10:
  - 2. The method of claim 1 wherein each of the single-character HMMs includes a final state to model optional intercharacter space.
  - 3. The method of claim 1 wherein the at least one parameter at a given horizontal location represents an upper and a lower character boundary at the given horizontal location.
  - 4. The method of claim 3 wherein the at least one parameter at a given horizontal location further represents an internal character structure.
  - 5. The method of claim 4 wherein the representation of internal character structure includes a set of autocorrelation values.
  - 6. The method of claim 4 wherein the representation of internal character structure includes the number of pixel transitions.
  - 7. The method of claim 1 and further comprising the step of providing the HMM network with a non-keyword HMM that models non-keywords.
  - 8. The method of claim 1 wherein the single-character HMMs were trained on a corpus containing bitmap images of text in a plurality of fonts.
  - 9. The method of claim 1 wherein:
    - the portion of the input image within a given bounding box is potentially a word; and
      
      the single-character HMMs that are concatenated to form the keyword HMM all have the context determined by whether the keyword contains ascenders or descenders.
  - 10. The method of claim 1 wherein the portion of the input image within a given bounding box is potentially a text line containing multiple words, and further including:
    - the step of incorporating the keyword HMM, referred to below as the first keyword HMM, into a network that includes a second keyword HMM for the same keyword wherein
      
      the first keyword HMM has single-character HMMs that all have a first context determined by whether the keyword contains ascenders or descenders, and
      
      the second keyword HMM has single-character HMMs that all have a second context, different from that of the single-character HMMs in the first keyword HMM, so as to accommodate the possibility that the keyword will be in the second context when embedded in the text line.
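
As a rough illustration of claims 1, 2, and 9, the sketch below picks a single context for a keyword from its ascenders and descenders and concatenates the matching single-character models. The ascender and descender letter sets, the tuple encoding of a context, the toy state labels, and every function name here are assumptions for illustration; a real system would hold trained state distributions rather than labels.

```python
import string

# Assumed letter classes; the claims do not enumerate them.
ASCENDERS = set("bdfhklt") | set(string.ascii_uppercase) | set(string.digits)
DESCENDERS = set("gjpqy")

def word_context(word):
    """Context of a word: (contains an ascender, contains a descender)."""
    return (any(c in ASCENDERS for c in word), any(c in DESCENDERS for c in word))

def possible_contexts(ch):
    """Contexts a character can appear in; its own ascender or descender forces that flag on."""
    asc_opts = (True,) if ch in ASCENDERS else (False, True)
    desc_opts = (True,) if ch in DESCENDERS else (False, True)
    return [(a, d) for a in asc_opts for d in desc_opts]

def toy_char_models(chars, n_states=2):
    """Stand-in for trained models: one list of state labels per (character, context).

    The last state of each model plays the role of the optional
    intercharacter-space state of claim 2.
    """
    models = {}
    for ch in chars:
        for ctx in possible_contexts(ch):
            models[(ch, ctx)] = [(ch, i, ctx) for i in range(n_states)] + [("space", ch, ctx)]
    return models

def build_keyword_hmm(keyword, char_models):
    """Concatenate the character models that share the keyword's context."""
    ctx = word_context(keyword)
    states = []
    for ch in keyword:
        states.extend(char_models[(ch, ctx)])
    return ctx, states

models = toy_char_models("dogsea")
ctx, states = build_keyword_hmm("dog", models)
print(ctx, len(states))
```

For the text-line case of claim 10, the same helper could be called with a different context to produce the second keyword HMM for the same keyword.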

11. For use in a keyword-spotting network that includes at least one keyword hidden Markov model (HMM), a method of providing a context-insensitive, character based, non-keyword HMM comprising:
- providing first, second, third, and fourth sets of character HMMs, each set modeling a set of characters in a particular context;
  
  connecting the character HMMs from the first, second, third, and fourth sets of character HMMs in parallel between null states with a return loop whereby the optimal path for a non-keyword through the HMM is not constrained by the context of the characters in the non-keyword.

12. For use in a keyword-spotting network that includes at least one keyword hidden Markov model (HMM), a method of providing a context-sensitive, character-based, non-keyword HMM comprising:
- providing first, second, third, and fourth sets of character HMMs, each set modeling a set of characters in a particular context;
  
  constructing respective single-context character set HMMs, each having a respective set of character HMMs for the respective context connected in parallel between null states with a return loop; and
  
  connecting the single-context character set HMMs in parallel between null states with no return loops whereby the optimal path for a non-keyword through the HMM is constrained by the context of the characters in the non-keyword.
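
The difference between the topologies of claims 11 and 12 can be seen in a small graph sketch. Each character model is reduced to a single named node, null states are nodes whose names start with `null`, and the `accepts` helper (a hypothetical checker, not part of the patent) tests whether a sequence of character models forms a path from the entry null state to the exit null state. The model names and context tags are made up for the example.

```python
def char_set_loop(models, tag):
    """Character models in parallel between two null states, with a return loop."""
    start, end = f"null_in_{tag}", f"null_out_{tag}"
    edges = {start: set(models), end: {start}}  # end -> start is the return loop
    for m in models:
        edges[m] = {end}
    return start, end, edges

def context_insensitive(sets):
    """Claim 11 style: all four sets share one loop, so contexts may mix freely."""
    all_models = [m for models in sets.values() for m in models]
    return char_set_loop(all_models, "all")

def context_sensitive(sets):
    """Claim 12 style: one loop per context, joined in parallel with no outer loop."""
    start, end = "null_in", "null_out"
    edges = {start: set(), end: set()}
    for ctx, models in sets.items():
        sub_start, sub_end, sub_edges = char_set_loop(models, ctx)
        edges.update(sub_edges)
        edges[start].add(sub_start)
        edges[sub_end].add(end)  # leave the context loop, never re-enter another
    return start, end, edges

def accepts(net, seq):
    """True if seq (a list of model names) is a valid start-to-end path."""
    start, end, edges = net

    def null_closure(nodes):
        # Follow transitions through null states only; they consume no input.
        seen, stack = set(nodes), list(nodes)
        while stack:
            n = stack.pop()
            if n.startswith("null"):
                for m in edges[n]:
                    if m not in seen:
                        seen.add(m)
                        stack.append(m)
        return seen

    current = null_closure({start})
    for model in seq:
        if model not in current:
            return False
        current = null_closure(edges[model])
    return end in current

sets = {
    "x": ["a@x", "e@x"],
    "asc": ["a@asc", "d@asc"],
    "desc": ["a@desc", "g@desc"],
    "both": ["d@both", "g@both"],
}
print(accepts(context_insensitive(sets), ["a@x", "d@asc"]))   # mixed contexts allowed
print(accepts(context_sensitive(sets), ["a@x", "d@asc"]))     # mixed contexts rejected
```

The same two constructions would apply to the image-slice variants by substituting image-slice states for character models.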

13. For use in a keyword-spotting network that includes at least one keyword hidden Markov model (HMM), a method of providing a context-insensitive, image-slice based, non-keyword HMM comprising:
- providing a set of image-slice states wherein the image-slice states model vertical slices of a portion of the input image without regard to the context of characters in that portion of the input image; and
  
  connecting the set of image-slice states in parallel between null states with a return loop whereby the optimal path for a non-keyword through the HMM is not constrained by the context of the characters in the non-keyword.

14. For use in a keyword-spotting network that includes at least one keyword hidden Markov model (HMM), a method of providing a context-sensitive, image-slice based, non-keyword HMM comprising:
- providing first, second, third, and fourth sets of image-slice states corresponding to first, second, third, and fourth contexts, wherein the image-slice states for a given set model vertical slices of character images in the particular context for that set;
  
  constructing respective single-context image-slice set HMMs, each having the image-slice states for the respective context connected in parallel between null states with a return loop; and
  
  connecting the single-context image-slice set HMMs in parallel between null states with no return loops whereby the optimal path for a non-keyword through the HMM is constrained by the context of the characters in the non-keyword.

15. For use in a processor-based method of determining whether a keyword is present in a bitmap input image wherein the input image is subjected to an operation that determines bounding boxes for portions of the input image that potentially contain text and an operation that generates a sequence of features for each such portion of the input image, the text being considered to extend horizontally, a method for modeling the appearance of keywords in the input image, the method comprising the steps of:
- providing a set of previously-trained single-character hidden Markov models (HMMs);
  
  concatenating those single-character HMMs that correspond to the characters in the keyword so as to define a keyword HMM; and
  
  constructing an HMM network that includes the keyword HMM;
  
  said providing step being characterized in that:
  
  each character has a shape that is characterized by a number of feature vectors at a corresponding number of horizontal locations along the character,
  
  a given single-character HMM for a given character is characterized by a number of states, each state of which corresponds to a respective one of the number of distinct portions of the given character,
  
  each character has a number of distinct portions, the number being greater than 1 for at least some characters,
  
  each distinct portion spans a respective contiguous subset of the plurality of horizontal locations,
  
  each state is characterized by a statistical distribution of the feature vectors that characterize the corresponding distinct portion of the given character,
  
  each single-character HMM has a number of possible contexts depending on whether the character has an ascender or descender,
  
  the single-character HMMs that are concatenated to form the keyword HMM have the same context, and
  
  the feature vector for a given horizontal location in a given character includes a representation of an upper and lower character boundary referenced to a bounding box for the keyword.
- Dependent claims 16 to 18:
  - 16. The method of claim 15 wherein the single-character HMMs were trained on a corpus containing bitmap images of text in a plurality of fonts.
  - 17. The method of claim 15 wherein:
    - the portion of the input image within a given bounding box is potentially a word; and
      
      the single-character HMMs that are concatenated to form the keyword HMM all have the context determined by whether the keyword contains ascenders or descenders.
  - 18. The method of claim 15 wherein the portion of the input image within a given bounding box is potentially a text line containing multiple words, and further including:
    - the step of incorporating the keyword HMM, referred to below as the first keyword HMM, into a network that includes a second keyword HMM for the same keyword wherein
      
      the first keyword HMM has single-character HMMs that all have a first context determined by whether the keyword contains ascenders or descenders, and
      
      the second keyword HMM has single-character HMMs that all have a second context, different from that of the single-character HMMs in the first keyword HMM, so as to accommodate the possibility that the keyword will be in the second context when embedded in the text line.
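
To make the boundary features of claims 3, 6, and 15 concrete, here is a minimal sketch that computes, for each pixel column of a bounding box, the upper and lower ink boundary as a fraction of the box height plus the number of pixel transitions. The normalization by box height, the encoding of empty columns as `(0.0, 0.0, 0)`, and the function name are assumptions; the claims only require that the boundaries be referenced to the bounding box.

```python
def column_features(box_img):
    """Per-column (upper, lower, transitions) features for a binary bounding-box image.

    upper and lower are measured from the top of the box and divided by the
    box height, so no baseline is needed (assumed normalization).
    """
    height, width = len(box_img), len(box_img[0])
    features = []
    for x in range(width):
        column = [box_img[y][x] for y in range(height)]
        ink_rows = [y for y, pixel in enumerate(column) if pixel]
        if ink_rows:
            upper = ink_rows[0] / height
            lower = (ink_rows[-1] + 1) / height
        else:
            upper = lower = 0.0  # arbitrary encoding for an empty column
        # Count changes between vertically adjacent pixels (claim 6 style feature).
        transitions = sum(column[y] != column[y + 1] for y in range(height - 1))
        features.append((upper, lower, transitions))
    return features

# The same two-row glyph in a tight box and in a box made taller by an ascender elsewhere.
short_box = [[1, 1], [1, 1]]
tall_box = [[0, 0], [0, 0], [1, 1], [1, 1]]
print(column_features(short_box))
print(column_features(tall_box))
```

The two printed feature sequences differ even though the glyph is identical, which is the reason the claims give each single-character HMM separate states per context.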


Current Assignee: Xerox Corporation (Xerox Holdings Corp.)
Original Assignee: Xerox Corporation (Xerox Holdings Corp.)
Inventors: Chen, Francine R.; Wilcox, Lynn D.
Primary Examiner(s): Mancuso, Joseph
Assistant Examiner(s): DEL ROSSO, GERARD DOMNICK
Application Number: US08/387,958
Time in Patent Office: 694 Days
Field of Search: 382/195, 382/196, 382/173, 382/174, 382/204, 382/254, 382/256-258, 382/155, 382/224, 382/226
US Class Current: 382/218

CPC Class Codes

G06F 18/295   Markov models or related mo...

G06V 30/10   Character recognition

G06V 30/146   Aligning or centring of the...

G06V 30/19187   Graphical models, e.g. Baye...

G06V 30/262   using context analysis, e.g...
