Automatic method of identifying sentence boundaries in a document image

  • US 5,892,842 A
  • Filed: 12/14/1995
  • Issued: 04/06/1999
  • Est. Priority Date: 12/14/1995
  • Status: Expired due to Term
First Claim

1. A method of identifying sentence boundaries within a document image without performing character recognition, the document image including a multiplicity of connected components, each connected component having a bounding box, the method being implemented by a processor coupled to a memory storing instructions representing the method, the method comprising the steps of:

  a) selecting a connected component from the multiplicity of connected components;

  b) determining whether the selected connected component might represent a period based upon a shape of the selected connected component;

  c) determining whether the selected connected component might represent a colon; and

  d) labeling the selected connected component as a sentence boundary if the selected connected component might be a period and is not part of a colon.
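
Steps a) through d) can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not state the shape criteria or how a colon is detected, so the `Box` type, the thresholds (0.4 of the median component height, aspect ratio between 0.5 and 2.0) and the "two period-like dots stacked vertically" colon test are all assumptions made for illustration. The sketch also ignores other dot-like marks, such as the dot of an "i".

```python
from dataclasses import dataclass
from statistics import median
from typing import List


@dataclass(frozen=True)
class Box:
    """Bounding box of one connected component (hypothetical type)."""
    x: int  # left edge
    y: int  # top edge; y grows downward
    w: int  # width
    h: int  # height


def might_be_period(box: Box, typical_height: float) -> bool:
    """Step b): shape test. Assumed rule: small and roughly square."""
    if box.w <= 0 or box.h <= 0:
        return False
    limit = 0.4 * typical_height  # assumed threshold
    small = box.w <= limit and box.h <= limit
    return small and 0.5 <= box.w / box.h <= 2.0


def might_be_colon_part(i: int, boxes: List[Box], typical_height: float) -> bool:
    """Step c): assumed rule. Another period-like component sits directly
    above or below component i, within one typical height."""
    box = boxes[i]
    for j, other in enumerate(boxes):
        if j == i or not might_be_period(other, typical_height):
            continue
        # Horizontal overlap of the two boxes (positive when they overlap).
        overlap = min(box.x + box.w, other.x + other.w) - max(box.x, other.x)
        # Vertical gap between the boxes (positive when they are stacked).
        gap = max(box.y, other.y) - min(box.y + box.h, other.y + other.h)
        if overlap > 0 and 0 < gap <= typical_height:
            return True
    return False


def label_sentence_boundaries(boxes: List[Box]) -> List[int]:
    """Steps a) and d): visit each component and return the indices of
    those that might be a period and are not part of a colon."""
    if not boxes:
        return []
    typical_height = median(b.h for b in boxes)
    return [
        i
        for i, box in enumerate(boxes)
        if might_be_period(box, typical_height)
        and not might_be_colon_part(i, boxes, typical_height)
    ]
```

Only bounding-box geometry is used here, in keeping with the claim's requirement that no character recognition be performed. How the connected components and their bounding boxes are extracted from the image is outside the scope of the sketch.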
