DOCUMENT IMAGE SEGMENTATION SYSTEM

  • US 20100290701A1
  • Filed: 04/01/2010
  • Published: 11/18/2010
  • Est. Priority Date: 05/13/2009
  • Status: Active Grant
First Claim

1. A system for document image segmentation, said system comprising:

    input means adapted to input a document image;

    image pre-processing means adapted to pre-process said document image by maintaining the aspect ratio, said pre-processing means including a colour quantization means to give a pre-processed quantized image;

    colour space transformation means adapted to receive said pre-processed quantized image and apply a Hue, Saturation and Value colour space transformation on said quantized image to derive a transformed image containing only saturation component of said quantized image;

    first image energy calculation means adapted to receive said transformed image and calculate both horizontal and vertical energies of said transformed image to provide a first energy image by cumulating both of said calculated energies of said transformed image;

    grayscale image conversion means adapted to receive said pre-processed quantized image and perform a grayscale conversion operation on said quantized image to provide a gray scale image;

    second image energy calculation means adapted to receive said gray scale image and calculate both horizontal and vertical energies of said gray scale image to provide a second energy image by cumulating both of said calculated energies and said gray scale image;

    computational means adapted to receive said first energy image and second energy image to compute a maximum of both the energies and provide a maximum energy image;

    binarization means adapted to receive said maximum energy image and provide a binarized image;

    dilation means adapted to receive said binarized image and perform a dilation operation to provide a dilated image;

    clustering means adapted to receive said dilated image and formulate different clusters based on the density of the dilated areas and provide a clustered image; and

    box creation means adapted to create bounding boxes enclosing each cluster in the clustered image to form an image of the document having image segments.
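
The chain of means recited in claim 1 can be read as an image-processing pipeline. The following is a minimal, dependency-free Python sketch of that reading, not the patented implementation. Several details are assumptions, because the claim does not fix them: the "energy" is taken to be the sum of absolute horizontal and vertical pixel differences, binarization uses a fixed threshold, dilation uses a 3x3 neighbourhood, and the density-based clustering is replaced by plain 8-connected component grouping. The aspect-ratio-preserving resize of the pre-processing step is omitted. All function names and parameters (`quantize`, `energy`, `segment`, `levels`, `thresh`, and so on) are hypothetical.

```python
from collections import deque

def quantize(img, levels=8):
    """Colour quantization: snap each 0-255 channel to `levels` evenly sized bins."""
    step = 256 // levels
    return [[tuple((c // step) * step for c in px) for px in row] for row in img]

def saturation(img):
    """HSV saturation channel: (max - min) / max, or 0 for black pixels."""
    return [[(max(px) - min(px)) / max(px) if max(px) else 0.0 for px in row]
            for row in img]

def grayscale(img):
    """Luma-weighted grayscale scaled to 0..1 (assumed BT.601 weights)."""
    return [[(0.299 * r + 0.587 * g + 0.114 * b) / 255 for r, g, b in row]
            for row in img]

def energy(ch):
    """Assumed energy: |horizontal difference| + |vertical difference| per pixel."""
    h, w = len(ch), len(ch[0])
    return [[abs(ch[y][min(x + 1, w - 1)] - ch[y][x]) +
             abs(ch[min(y + 1, h - 1)][x] - ch[y][x])
             for x in range(w)] for y in range(h)]

def pixel_max(a, b):
    """Per-pixel maximum of two energy images."""
    return [[max(p, q) for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def binarize(e, thresh=0.1):
    """Fixed-threshold binarization (the threshold value is an assumption)."""
    return [[1 if v > thresh else 0 for v in row] for row in e]

def dilate(b):
    """Binary dilation with a 3x3 neighbourhood."""
    h, w = len(b), len(b[0])
    return [[1 if any(b[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))) else 0
             for x in range(w)] for y in range(h)]

def bounding_boxes(b):
    """Group foreground pixels into 8-connected components (a stand-in for
    the claim's density-based clustering); return (x0, y0, x1, y1) per group."""
    h, w = len(b), len(b[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not b[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0, y0, x1, y1 = x, y, x, y
            while queue:
                cx, cy = queue.popleft()
                x0, y0 = min(x0, cx), min(y0, cy)
                x1, y1 = max(x1, cx), max(y1, cy)
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if b[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes

def segment(img):
    """Run the sketched pipeline on a list of rows of (r, g, b) tuples."""
    q = quantize(img)
    first_energy = energy(saturation(q))    # saturation branch
    second_energy = energy(grayscale(q))    # grayscale branch
    binary = binarize(pixel_max(first_energy, second_energy))
    return bounding_boxes(dilate(binary))
```

Calling `segment` on an image given as a list of rows of `(r, g, b)` tuples returns one inclusive `(x0, y0, x1, y1)` box per detected region. Taking the per-pixel maximum of the two branches lets an edge register if it shows up in either saturation or brightness, which appears to be the purpose of the claimed maximum energy image.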
