- Iñigo Aduna (r0973686)
- Ahmad BeigRezaei (r0969764)
- Minoo Safaeikouchaksaraei (r0972740)
- Floris De Feyter: floris.defeyter@kuleuven.be
The dataset comprises eight classes (Angry, Disgust, Fear, Happy, Sad, Surprise, Neutral, Contempt), obtained by merging three source datasets.
Name | Description |
---|---|
Data_Integration | This notebook illustrates the features of the original datasets from our projects and integrates them into a consolidated dataset. |
The models used in this project are pre-trained on the ImageNet dataset for image classification with inputs of size 224x224x3. To adapt our dataset to these models, we implemented several image-processing steps.
Name | Description |
---|---|
Data_Analysis | This section conducts data analysis on the integrated dataset to finalize the proposed dataset. |
Preprocessing | The preprocessing phase converts images from grayscale to RGB, ensuring values are properly distributed across the three channels, converts .png images to .jpg, and applies data augmentation techniques. |
Corruption | To evaluate how blurring affects images at different severity levels, we use the blur(img, s) function from ImageNet-C. We generate six distinct test sets, one per corruption severity level. |
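The grayscale-to-RGB conversion, .png-to-.jpg conversion, and severity-based blurring described above can be sketched as follows. This is a minimal illustration using Pillow; the radius-per-severity mapping in `BLUR_RADII` is an assumption for demonstration, not the exact parameterization used by ImageNet-C.

```python
from PIL import Image, ImageFilter

# Hypothetical severity -> Gaussian-blur radius mapping (illustrative only;
# ImageNet-C defines its own per-severity parameters).
BLUR_RADII = {1: 1, 2: 2, 3: 3, 4: 4, 5: 6}

def to_rgb_jpg(src_path, dst_path):
    """Convert a grayscale .png to a 3-channel RGB .jpg.

    .convert("RGB") replicates the single gray channel into all three
    channels, so pretrained RGB models receive a valid input distribution.
    """
    img = Image.open(src_path).convert("RGB")
    img.save(dst_path, "JPEG")

def blur(img, s):
    """Apply Gaussian blur at severity s (1-5), mirroring the blur(img, s)
    interface mentioned in the table above."""
    return img.filter(ImageFilter.GaussianBlur(radius=BLUR_RADII[s]))
```

A severity-0 test set would simply be the unblurred images, which together with severities 1-5 yields the six test sets.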
We utilized the VGG-16 architecture for our emotion classification model. Below are the notebooks used in this stage:
Name | Description |
---|---|
on_AffectNetFinal-Blurring-lv4.ipynb | Implements and trains the VGG-16 architecture using data augmentation techniques with blurring to enhance the model's robustness against image variations. |
on_AffectNet-Final-nonblur.ipynb | Details the implementation and training of VGG-16 without blurring techniques, providing a comparison point to evaluate the effectiveness of data augmentation with blurring. |
VGG16_Final_VSC_Server.ipynb | Consolidates the final results of the VGG-16 training, including performance metrics and visualization of the results. |
VGG16_BinarryClassification_AffNetBinary.ipynb | Implements and trains the VGG-16 architecture on the binary classification task of the AffNetBinary dataset. |
Name | Description |
---|---|
Inferance.ipynb | Loads our trained models from saved .pth files and runs inference with them. |
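Loading a saved .pth checkpoint and predicting a class can be sketched as below. This assumes the .pth files store a `state_dict` (saved with `torch.save(model.state_dict(), path)`); the helper name is illustrative.

```python
import torch

def load_and_predict(model, ckpt_path, batch):
    """Restore weights from a .pth state_dict file and return class predictions."""
    state = torch.load(ckpt_path, map_location="cpu")
    model.load_state_dict(state)
    model.eval()  # disable dropout/batch-norm updates for inference
    with torch.no_grad():
        logits = model(batch)
    return logits.argmax(dim=1)
```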
This project is based on several research papers and code repositories that are pivotal for the advancement of attention mechanisms and robustness in artificial intelligence models. Below are the key references used:
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). "Attention Is All You Need." Neural Information Processing Systems (NIPS). Available at: https://arxiv.org/abs/1706.03762
- Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., ... & Houlsby, N. (2020). "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale." International Conference on Learning Representations. Available at: https://openreview.net/forum?id=YicbFdNTTy
- Li, S., Zhao, W., & Roy-Chowdhury, A. K. (2021). "Vision Transformers for Facial Emotion Recognition." Conference on Computer Vision and Pattern Recognition. Link not available.
- Vision Transformers for Facial Emotion Recognition: A repository implementing Vision Transformers for facial emotion recognition.
- ImageNet-C: This repository provides common image corruptions (including blurring) at multiple severity levels for evaluating the robustness of machine learning models.