Deep learning autoencoder approach: Automatic recognition of artistic Arabic calligraphy types

Authors

Keywords:

artistic Arabic calligraphy, autoencoder, deep learning, optical font recognition

Abstract

Recognition of Arabic calligraphy types is a challenging problem. Difficulties include similarities among different types, overlap between letters, and letters that assume different shapes. In this study, a deep learning approach to recognizing artistic Arabic calligraphy types is presented. An autoencoder is a deep learning model that can reduce data dimensionality in addition to extracting features, and autoencoders can be stacked in several layers. The system is composed of three layers: two encoder layers that extract features and one softmax layer for the recognition stage. The font is recognized collectively, based on the words or segments that exist in the font image. The input of the system consists of the individual word or segment images that compose the font image, and the output is the recognized font type. The approach was evaluated on local and public datasets, and the achieved recognition rates were 92.1% and 99.5%, respectively.
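
The pipeline described in the abstract (two encoder layers, a softmax layer, and a collective decision over word or segment images) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The layer sizes, the sigmoid activation, the random untrained weights, the placeholder labels, and the majority-vote rule used to combine segment predictions are all assumptions made for illustration, since the abstract does not specify them. Training (for example, layer-wise pretraining of the encoders) is omitted.

```python
import math
import random
from collections import Counter


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    # One fully connected layer: weights has one row per output unit.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_out, n_in, rng):
    # Hypothetical random initialisation; a real system would use trained weights.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def classify_segment(pixels, layers):
    # Two encoder layers extract features; the softmax layer scores font types.
    (w1, b1), (w2, b2), (w3, b3) = layers
    h1 = [sigmoid(v) for v in dense(pixels, w1, b1)]
    h2 = [sigmoid(v) for v in dense(h1, w2, b2)]
    return softmax(dense(h2, w3, b3))


def recognize_font(segments, layers, labels):
    # Assumed aggregation rule: each segment votes for its most probable type.
    votes = Counter()
    for pixels in segments:
        probs = classify_segment(pixels, layers)
        votes[labels[probs.index(max(probs))]] += 1
    return votes.most_common(1)[0][0]


rng = random.Random(0)
labels = ["type_a", "type_b", "type_c"]          # placeholder font types
layers = [init_layer(16, 64, rng),               # encoder 1: 64 -> 16
          init_layer(8, 16, rng),                # encoder 2: 16 -> 8
          init_layer(len(labels), 8, rng)]       # softmax layer: 8 -> 3
# Five fake 8x8 segment images, flattened to 64 values in [0, 1].
segments = [[rng.random() for _ in range(64)] for _ in range(5)]
result = recognize_font(segments, layers, labels)
```

Because the weights here are random, `result` carries no meaning beyond showing the data flow: each segment image yields a probability vector over font types, and the font image as a whole receives the label that most of its segments agree on.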

Author Biography

Rami Al-Hmouz, King Abdulaziz University

Rami Al-Hmouz received the BSc degree in electrical engineering/telecommunication from Mutah University in 1998, the MSc degree in electrical engineering/communication from the University of Jordan in 2002, the MSc degree in computer engineering from the University of Western Sydney in 2004, and the PhD degree in computer engineering from the University of Technology, Sydney, in 2008. Currently, he is a professor at King Abdulaziz University in Saudi Arabia.

Published

02-07-2020