Institutional Repository

Investigation of hierarchical deep neural network structure for facial expression recognition

dc.contributor.author Motembe, Dodi
dc.date.accessioned 2021-05-31T10:35:30Z
dc.date.available 2021-05-31T10:35:30Z
dc.date.issued 2020-01
dc.identifier.uri http://hdl.handle.net/10500/27389
dc.description.abstract Facial expression recognition (FER) remains a challenging problem: machines struggle to track the dynamic shifts in facial expressions of human emotion. Existing systems that have proven effective rely on deep network structures that need powerful and expensive hardware. The deeper the network, the longer training and testing take, and many systems use expensive GPUs to speed up the process. To address these challenges while keeping the main goal of improving recognition accuracy, we create a generic hierarchical structure with variable settings. This generic structure has a hierarchy of three convolutional blocks, two dropout blocks and one fully connected block. From this generic structure we derived four network structures (cases) to investigate according to their performance. From each case we then derived six network structures by varying the parameters under analysis: the filter sizes of the convolutional maps and of the max-pooling, and the number of convolutional maps. In total, we investigate 24 network structures, six per case. Over many repeated experiments, case 1a emerged as the top performer of the case 1 group, and case 2a, case 3c and case 4c outperformed the others in their respective groups. Comparing the four group winners indicates that case 2a is the optimal structure with optimal parameters, outperforming the other group winners. The best network structure was chosen by considering the minimum, average and maximum accuracy over 15 repeated rounds of training and analysis of the results. All 24 proposed network structures were tested on two of the most widely used FER datasets, CK+ and JAFFE.
After repeated simulations, the results demonstrate that our inexpensive optimal network architecture achieved 98.11 % accuracy on the CK+ dataset. We also tested the optimal architecture on the JAFFE dataset, where the experimental results show 84.38 %, using just a standard CPU and simpler procedures. We also compared the four group winners with the performance of other existing FER models reported recently in two studies. These models used the same two datasets, CK+ and JAFFE. On CK+, three of our four group winners (case 1a, case 2a and case 4c) scored only 1.22 % below the accuracy of the top-performing model; on JAFFE, two of our network structures, case 2a and case 3c, came in third, beating other models. en
dc.language.iso en en
dc.subject Facial Expression Recognition (FER) en
dc.subject Deep Learning en
dc.subject Convolutional Neural Network (CNN) en
dc.subject Deep Convolutional Neural Network (DCNN) en
dc.subject Artificial Intelligence en
dc.subject Hierarchical Deep Neural Network Structure en
dc.subject Face Detection en
dc.subject Facial Feature Extraction en
dc.subject Central Processing Unit (CPU) en
dc.subject Graphics Processing Unit (GPU) en
dc.title Investigation of hierarchical deep neural network structure for facial expression recognition en
dc.type Dissertation en
dc.description.department Electrical and Mining Engineering en
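
The bookkeeping described in the abstract (four cases with six variants each, and a minimum/average/maximum accuracy summary over repeated runs) can be sketched roughly as follows. This is an illustration only, not the dissertation's code: the variant letters a to f are assumed from labels such as "case 2a" and "case 3c", and the function names `structure_labels` and `summarise` are hypothetical.

```python
from itertools import product
from statistics import mean
from string import ascii_lowercase

CASES = (1, 2, 3, 4)            # the four network structures derived from the generic one
VARIANTS = ascii_lowercase[:6]  # assumption: the six variants per case are lettered a..f


def structure_labels():
    """Return the 24 labels 'case 1a' .. 'case 4f' (4 cases x 6 variants)."""
    return [f"case {case}{variant}" for case, variant in product(CASES, VARIANTS)]


def summarise(accuracies):
    """Summarise repeated-run accuracies (in percent) for one network structure.

    Returns the three quantities the abstract names as selection criteria:
    minimum, average and maximum accuracy.
    """
    if not accuracies:
        raise ValueError("at least one accuracy value is required")
    return {
        "min": min(accuracies),
        "avg": mean(accuracies),
        "max": max(accuracies),
    }
```

In such a setup, `summarise` would be called once per label with the 15 accuracies recorded for that structure. How the three values are weighed against each other when picking a winner is not stated in the abstract, so no ranking rule is sketched here.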

