Our goal is to build a CNN model that can identify future instances of a given sea turtle based on the images the model is trained on. The model identifies individual sea turtles by the unique segmented patterns on their heads. To train and test our model, we use the SeaturtleIDHeads dataset.
Note: Each row represents photos of a unique turtle.
Investigating the effects of tourism and social media on wildlife can be done by using CNN models to identify individual animals in social media posts and tracking how frequently they are posted (Papafitsoros, Adam, and Schofield 2023).
We first train our model on the well-established CIFAR-10 dataset and measure its performance; we call this the CIFAR-10 CNN model. We then use the same network architecture to train two models on the SeaturtleIDHeads dataset: the first splits the data randomly into training and testing sets (Random-Split model), and the second splits the data at a chosen date-time, training on earlier images and testing on later ones (Time-cutoff Split model). This lets us compare how well the Time-cutoff Split model identifies future instances of a given sea turtle relative to the Random-Split model.

The CIFAR-10 dataset consists of 60,000 32 × 32 images (6,000 per class) across 10 categories: dog, cat, deer, frog, horse, bird, plane, car, truck and ship. Since this is a standard dataset used to practice different CNN models, we build our model on it first to gauge performance.

From the SeaturtleIDHeads dataset of roughly 8,000 images, we take the subset corresponding to the 10 turtles with the most images. We also resize the images to 32 × 32 to reduce the computational burden.
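To make the two splitting strategies concrete, the sketch below builds both splits from the metadata dataframe. It is only an illustration: the file name `turtles.csv`, the column names `identity` and `date`, and the cutoff date are assumptions and may not match the actual dataset or the code in this repo.

```python
import pandas as pd

# Assumed metadata layout: one row per photo with an "identity" column
# (which turtle) and a "date" column (when the photo was taken).
df = pd.read_csv("turtles.csv", parse_dates=["date"])

# Keep only the 10 turtles with the most images.
top10 = df["identity"].value_counts().nlargest(10).index
df = df[df["identity"].isin(top10)]

# Random split: shuffle the rows and hold out 20% for testing.
shuffled = df.sample(frac=1.0, random_state=0)
cut = int(0.8 * len(shuffled))
random_train, random_test = shuffled.iloc[:cut], shuffled.iloc[cut:]

# Time-cutoff split: train on photos taken before the cutoff and test on
# photos taken after it, mimicking "future" sightings of the same turtles.
cutoff = pd.Timestamp("2017-01-01")  # arbitrary example date
time_train = df[df["date"] < cutoff]
time_test = df[df["date"] >= cutoff]
```

The time-cutoff split is the stricter evaluation: at test time the model only sees photos taken after all of its training photos, which is the situation a deployed re-identification system would face.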
CIFAR-10 model: 82.25%
SeaturtleIDHeads Random Split model: 62.01%
SeaturtleIDHeads Time-cutoff Split model: 37.85%
Further analysis of the models is described in the research paper.
```bash
conda install pytorch pandas numpy seaborn
```
To get the sea turtle dataframe, go to the SeaturtleIDHeads dataset page and click 'Edit my Copy' to copy the page.
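Once the dataframe is available, it can be wrapped in a custom PyTorch Dataset so the images can be fed to a DataLoader. The sketch below is a minimal example in that spirit; the column names `file_path` and `identity` are assumptions, and the 32 × 32 resize mirrors the preprocessing described above.

```python
import torch
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms


class TurtleDataset(Dataset):
    """Wraps a metadata dataframe of image paths and turtle identities."""

    def __init__(self, df, label_to_idx):
        self.df = df.reset_index(drop=True)
        self.label_to_idx = label_to_idx  # e.g. {"t001": 0, "t002": 1, ...}
        self.transform = transforms.Compose([
            transforms.Resize((32, 32)),  # shrink to CIFAR-10-sized inputs
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        image = Image.open(row["file_path"]).convert("RGB")
        label = self.label_to_idx[row["identity"]]
        return self.transform(image), torch.tensor(label)
```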
Patrick Loeber provided the general structure of the model class as well as the training and testing loops; a minimal sketch of such a model class appears after these credits.
GeekAlexis has a repo containing code showing how to plot loss curves and how to print the average training and validation loss each epoch. The learning rate scheduler, batch normalisation, and dropout implementations were all inspired by this repo.
Konrad Szafer provided the initial code needed to create a custom dataset.
Sahar Millis provided code that plots a confusion matrix.
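For orientation, here is a minimal sketch of a CNN model class in the style described above, combining convolutional layers with batch normalisation and dropout for 32 × 32 RGB inputs and 10 classes. The specific layer widths and dropout rate are illustrative assumptions, not the exact architecture used in this repo.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvNet(nn.Module):
    """Small CNN for 32x32 RGB images and 10 output classes (illustrative)."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(32)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(64)
        self.pool = nn.MaxPool2d(2, 2)
        self.dropout = nn.Dropout(0.25)  # assumed rate, for illustration
        self.fc1 = nn.Linear(64 * 8 * 8, 128)
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.bn1(self.conv1(x))))  # 32x32 -> 16x16
        x = self.pool(F.relu(self.bn2(self.conv2(x))))  # 16x16 -> 8x8
        x = self.dropout(torch.flatten(x, 1))
        x = F.relu(self.fc1(x))
        return self.fc2(x)
```

This is why the same architecture can be reused across the CIFAR-10 model and the two sea turtle models: all three tasks use 10 classes and 32 × 32 inputs.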