For this assignment, we continue with modeling tasks designed to give you a deeper understanding of the transformer's component modules.
Specifically, we will focus on one of the pivotal layers that form the backbone of the transformer structure: the attention layer.
Please read the description here.
- You should have Python 3.10+ installed on your machine.
- (Optional) An NVIDIA GPU with CUDA 12.0+ is recommended; without one, some features may not work properly. (We will do our best to ensure that differences in hardware do not affect your score.)
- Install all the necessary dependencies with the following command, which may vary slightly between assignments.
pip install -r requirements.txt
- (Optional) We strongly recommend using a Docker image from the NVIDIA PyTorch Release, such as 23.10 or newer, as your base environment to avoid dependency conflicts.
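
To confirm that a machine meets the requirements listed above, a quick check along the following lines may help. This is only a minimal sketch, not part of the assignment: the names `check_environment` and `parse_version` are hypothetical, and it assumes PyTorch is among the installed dependencies (the GPU checks are skipped if it is not).

```python
import importlib.util
import sys

MIN_PYTHON = (3, 10)  # from the Python requirement above
MIN_CUDA = (12, 0)    # from the optional GPU requirement above


def parse_version(text):
    """Turn a version string such as '12.1' into a tuple of ints, (12, 1)."""
    return tuple(int(part) for part in text.split(".")[:2])


def check_environment():
    """Collect findings in a dict; a missing GPU is reported, not raised."""
    report = {"python_ok": sys.version_info[:2] >= MIN_PYTHON}

    # Assumption: PyTorch is one of the dependencies. Skip GPU checks if absent.
    if importlib.util.find_spec("torch") is None:
        report["torch_installed"] = False
        return report

    import torch

    report["torch_installed"] = True
    report["cuda_available"] = torch.cuda.is_available()
    cuda_version = torch.version.cuda  # None on builds without CUDA support
    report["cuda_ok"] = (
        cuda_version is not None and parse_version(cuda_version) >= MIN_CUDA
    )
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

Note that `torch.version.cuda` reports the CUDA version PyTorch was built against, which may differ from the version supported by the installed driver.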