Volume 12, Number 1, September 2016 - DOI: http://dx.doi.org/10.21700/ijcis.2016.273

IJCIS

Computing and Information Sciences is a peer-reviewed journal committed to the timely publication of original research, survey, and tutorial contributions on the analysis and development of computing and information science. The journal is designed mainly to serve researchers and developers working in information and computing. Papers that combine theoretical analysis with carefully designed computational experiments are particularly welcome. The journal is published 2-3 times per year and distributed to librarians, universities, research centers, and researchers in computing, mathematics, and information science. The journal maintains strict refereeing procedures through its editorial policies in order to publish only papers of the highest quality. Refereeing is done by anonymous reviewers. Reviews often take four to six months to obtain, occasionally longer, and the publication process takes several additional months.


Automatic Script and Type Identification in Bi-lingual Forms

Nabil Aouadi
Afef Kacem Echi* (email: afef.kacem@esstt.rnu.tn)

LaTICE Laboratory, University of Tunis, Avenue Taha Hussein, Montfleury, 1008 Tunis, Tunisia

*Corresponding author.

Received: 10 June 2016
Revised: 25 June 2016
Accepted: 31 August 2016
Published: 29 September 2016

Abstract: This paper presents a system that automatically discriminates between machine-printed and handwritten words in structured bi-lingual (Arabic and French) forms. The system has been applied, with encouraging results, to forms of the Tunisian National Health Insurance Fund used for refunding medical care costs. In these forms, handwritten data often touch or cross the preprinted frames and texts, which creates complex problems for the recognition routines. In addition, each text type should be processed with its own methods in order to optimize recognition accuracy. This work addresses these issues and, in particular, the problem of machine-printed/handwritten and Arabic/French word discrimination. To this end, a co-occurrence matrix of oriented gradients is computed from each word image and used as input to a k-Nearest Neighbor classifier. Experiments were carried out on 20 forms, and an average script identification rate of 98.31% was achieved.

Keywords: Bi-lingual forms; Word script and type identification; Text-line Segmentation; Word Extraction; Co-occurrence Matrix of Oriented Gradients; Classification.
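
The pipeline summarized in the abstract (a co-occurrence matrix of oriented gradients fed to a k-Nearest Neighbor classifier) can be illustrated with a minimal sketch. This is not the authors' implementation. The number of orientation bins (8), the single pixel offset (1, 0), the Euclidean distance, k = 3, and all function names are assumptions chosen for illustration only; the word image is assumed to be a grayscale array given as a list of rows.

```python
import math
from collections import Counter


def orientation_bins(img, n_bins=8):
    """Quantize the gradient orientation at each interior pixel.

    Pixels with a zero gradient (and border pixels) are marked None.
    n_bins=8 is an assumed setting, not a value taken from the paper.
    """
    h, w = len(img), len(img[0])
    bins = [[None] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]  # central difference in x
            gy = img[y + 1][x] - img[y - 1][x]  # central difference in y
            if gx == 0 and gy == 0:
                continue
            angle = math.atan2(gy, gx) % (2 * math.pi)
            bins[y][x] = int(angle / (2 * math.pi) * n_bins) % n_bins
    return bins


def cooccurrence_features(img, offset=(1, 0), n_bins=8):
    """Count pairs of orientation bins at pixels separated by `offset`.

    Returns the n_bins x n_bins co-occurrence matrix, flattened and
    normalized to sum to 1 (all zeros if no valid pair exists).
    """
    dx, dy = offset
    bins = orientation_bins(img, n_bins)
    h, w = len(bins), len(bins[0])
    counts = [0.0] * (n_bins * n_bins)
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if not (0 <= y2 < h and 0 <= x2 < w):
                continue
            a, b = bins[y][x], bins[y2][x2]
            if a is not None and b is not None:
                counts[a * n_bins + b] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to `query`.

    `train` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

In such a sketch, each labeled word image would be converted with `cooccurrence_features` to build the training list, and `knn_predict` would then assign a label (for example, a script or text type) to the feature vector of a new word image. A fuller descriptor would typically combine several offsets rather than one.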


