System and methods for Arabic text recognition based on effective Arabic text feature extraction

  • US 8,908,961 B2
  • Filed: 04/23/2014
  • Issued: 12/09/2014
  • Est. Priority Date: 04/27/2009
  • Status: Active Grant
First Claim

1. A method for automatically recognizing Arabic text, comprising:

  • acquiring a text image comprising one or more Arabic words each including one or more Arabic characters;

    identifying a plurality of lines of Arabic text in the text image;

    segmenting one of the plurality of lines of Arabic text into Arabic words;

    digitizing at least one of the Arabic words to form a two-dimensional array of pixels each associated with a pixel value, wherein the pixel value is expressed in a binary number;

    dividing the one of the Arabic words into a plurality of line images;

    defining a plurality of cells in one of the plurality of line images, wherein each of the plurality of cells comprises a group of adjacent pixels;

    serializing pixel values of pixels in each of the plurality of cells in one of the plurality of line images to form a binary cell number;

    forming a text feature vector according to binary cell numbers obtained from the plurality of cells in one of the plurality of line images; and

    feeding the text feature vector into a Hidden Markov Model to recognize the one or more Arabic words including the Arabic characters.
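
The feature-extraction steps of the claim (dividing a word into line images, defining cells, serializing pixel values into binary cell numbers, and forming a feature vector) can be illustrated with a minimal sketch. It rests on assumptions the claim does not state: each line image is taken to be a one-pixel-wide vertical column of the word bitmap, each cell is a run of `cell_size` vertically adjacent pixels, and a short final cell is zero-padded. The names `line_images`, `cell_numbers`, `feature_vectors`, and `cell_size` are hypothetical and chosen only for this illustration.

```python
from typing import List

Bitmap = List[List[int]]  # rows of binary pixel values (0 or 1)


def line_images(word: Bitmap) -> List[List[int]]:
    """Split a word bitmap into line images.

    Assumption: a line image is a one-pixel-wide vertical column.
    """
    if not word:
        return []
    width = len(word[0])
    return [[row[x] for row in word] for x in range(width)]


def cell_numbers(line: List[int], cell_size: int) -> List[int]:
    """Group adjacent pixels into cells and serialize each cell.

    The pixel values of a cell are read in order as the bits of one
    integer, most significant bit first (the "binary cell number").
    """
    numbers = []
    for start in range(0, len(line), cell_size):
        cell = line[start:start + cell_size]
        # Assumption: zero-pad a short final cell to the full cell size.
        cell = cell + [0] * (cell_size - len(cell))
        value = 0
        for bit in cell:
            value = (value << 1) | bit
        numbers.append(value)
    return numbers


def feature_vectors(word: Bitmap, cell_size: int = 4) -> List[List[int]]:
    """Return one feature vector (list of cell numbers) per line image."""
    return [cell_numbers(line, cell_size) for line in line_images(word)]


if __name__ == "__main__":
    word = [
        [0, 1],
        [1, 1],
        [0, 0],
        [1, 0],
    ]
    # Two columns, each cut into two cells of two pixels.
    print(feature_vectors(word, cell_size=2))
```

Under these assumptions, the resulting sequence of per-column vectors is the kind of observation sequence that could then be fed to a Hidden Markov Model, as the final step of the claim recites; the model itself is outside the scope of this sketch.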
