Utilization of Convolutional Neural Network for Effective Recognition of Complex and Common Facial Emotions

Authors

  • Ammar Ibrahim Majeed, Al-Nahrain University
  • Suhad Qasim Naeem, Al-Nahrain University
  • Elaf A. Saeed, Al-Nahrain University

DOI:

https://doi.org/10.18196/jrc.v6i3.25804

Keywords:

Expression Recognition, Convolutional Neural Network, Deep Learning, Confusion Matrix Analysis, Emotion Variability

Abstract

Facial expression recognition is an important area of computer vision for human-computer interaction. The convolutional neural network (CNN) model in this work was tested on the FER-2013 dataset, and the experimental results demonstrated a superior recognition rate. Because FER-2013 contains data collected in an experimental environment, a self-made facial expression dataset captured under natural conditions was created to verify the generalization capability of the model, and the models were trained on this dataset to identify emotions from face photos. FER-2013 nevertheless has biases and limitations, including low resolution (48 x 48 pixels) and class imbalance, which causes some emotions to be overrepresented. It also lacks demographic data, which may cause the model to perform poorly for some groups, and it assumes that emotions are entirely distinct even though they are frequently mixed and context-dependent. More varied datasets, better class balance, the addition of demographic data and context, and more sophisticated deep learning could all be employed to boost performance. A series of pre-processing steps was also applied to the face images, such as cropping and pixel adjustment. Cropping increases processing efficiency by removing extraneous portions of the image to highlight the crucial area, while pixel manipulation, such as normalization and contrast enhancement, improves analysis and makes the image more readable. The expression recognition results indicate that the model achieved an overall accuracy of 85.10% on the self-made natural expression dataset. Recognition accuracy was high for happy, neutral, and surprised expressions, while it was lower for disgust and fear expressions due to their variability and similarity in features.
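The pre-processing described above (cropping and pixel adjustment) can be sketched as follows. This is a hypothetical NumPy illustration of the general technique, not the authors' implementation; the function name and crop-box convention are assumptions.

```python
import numpy as np

def preprocess_face(image, box):
    """Crop a face region, then contrast-stretch and normalize it.

    image: 2-D uint8 grayscale array.
    box:   (top, left, height, width) region to keep.
    Returns a float32 array with values scaled to [0, 1].
    """
    top, left, h, w = box
    face = image[top:top + h, left:left + w]   # cropping: keep only the crucial area

    face = face.astype(np.float32)
    lo, hi = face.min(), face.max()
    if hi > lo:
        # min-max contrast stretch doubles as normalization to [0, 1]
        face = (face - lo) / (hi - lo)
    else:
        face = np.zeros_like(face)             # flat image: nothing to stretch
    return face

# Example: crop a synthetic 64x64 image down to a 48x48 patch (FER-2013 size)
img = np.arange(64 * 64, dtype=np.uint8).reshape(64, 64)
patch = preprocess_face(img, (8, 8, 48, 48))
print(patch.shape, float(patch.min()), float(patch.max()))  # (48, 48) 0.0 1.0
```

Real pipelines typically use a face detector to find the crop box; the fixed box here only stands in for that step.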
Happiness, neutrality, and surprise are recognized more accurately because they have distinctive facial traits that are simple for models to identify, such as a grin for happiness or an open mouth for surprise. In contrast, the model's accuracy is lower for disgust and fear expressions, since some of their characteristics resemble those of other emotions (for example, fear and surprise) and differ from person to person, making them challenging to tell apart. The confusion matrix highlights that fear expressions were often misidentified as surprise, primarily due to pupil dilation in both expressions. The study concludes that the developed pre-trained CNN model effectively recognizes facial expressions, demonstrating significant accuracy, particularly for certain emotions. Future work may focus on improving recognition rates for less distinct expressions and on expanding the dataset for better generalization.
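The confusion-matrix analysis mentioned above can be sketched as a short example. The labels and the toy predictions below are illustrative only (not the paper's data); the snippet shows how a row of the matrix exposes the fear-vs-surprise confusion the abstract reports.

```python
import numpy as np

LABELS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true label, columns the predicted label."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Toy predictions: true "fear" (index 2) is predicted as "surprise"
# (index 6) half the time, mimicking the reported confusion pattern.
y_true = [2, 2, 2, 2, 3, 3, 6, 6]
y_pred = [2, 6, 6, 2, 3, 3, 6, 6]
cm = confusion_matrix(y_true, y_pred, len(LABELS))

fear = LABELS.index("fear")
fear_accuracy = cm[fear, fear] / cm[fear].sum()
print(cm[fear])        # row for true "fear": 2 correct, 2 confused with "surprise"
print(fear_accuracy)   # 0.5
```

Reading along a row like this, rather than looking only at overall accuracy, is what reveals which specific emotion pairs the model conflates.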

Published

2025-05-14
