
assignment-3-modeling-attn

For this assignment, we continue to design modeling tasks to help you gain a deeper understanding of the transformer's component modules.

Specifically, we will focus on one of the pivotal layers that form the backbone of the transformer structure: the attention layer.

Tasks

Task 1: Offline Sliding-Window Attention

Please read the description here.
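
As context for this task, below is a minimal, illustrative sketch of offline (full-sequence) sliding-window attention, assuming a causal window in which each query position attends only to itself and the previous window_size - 1 positions. The function name, tensor shapes, and window convention are assumptions for illustration only; the linked description is authoritative.

    import torch
    import torch.nn.functional as F

    def offline_sliding_window_attention(q, k, v, window_size):
        """Full-sequence attention where query position i only attends to
        key positions j with i - window_size < j <= i (causal sliding window).

        q, k, v: tensors of shape (batch, num_heads, seq_len, head_dim).
        window_size: number of visible positions, including the current one.
        """
        seq_len = q.size(-2)
        scale = q.size(-1) ** -0.5

        # Build a causal sliding-window mask: True where attention is allowed.
        idx = torch.arange(seq_len, device=q.device)
        rel = idx[:, None] - idx[None, :]            # rel[i, j] = i - j
        mask = (rel >= 0) & (rel < window_size)      # in the window, not in the future

        scores = (q @ k.transpose(-2, -1)) * scale   # (batch, heads, seq, seq)
        scores = scores.masked_fill(~mask, float("-inf"))
        attn = F.softmax(scores, dim=-1)
        return attn @ v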

Task 2: Online Sliding-Window Attention

Please read the description here.
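
One common reading of "online" attention, assumed here, is the FlashAttention-style formulation: keys and values are processed block by block with a running (online) softmax, so the full score row is never materialized. The sketch below illustrates that idea for a single query position under a causal sliding window; the function name, shapes, and block size are hypothetical, and the linked description is authoritative.

    import torch

    def online_sliding_window_attention_row(q_i, k, v, i, window_size, block_size=128):
        """Attention output for query position i, computed block by block with a
        running softmax (max m, normalizer l, weighted-value accumulator acc).

        q_i: (head_dim,) query at position i; k, v: (seq_len, head_dim).
        Only keys j with i - window_size < j <= i contribute.
        """
        scale = q_i.size(-1) ** -0.5
        m = torch.tensor(float("-inf"))       # running max of scores seen so far
        l = torch.tensor(0.0)                 # running sum of exp(score - m)
        acc = torch.zeros_like(q_i)           # running weighted sum of values

        lo = max(0, i - window_size + 1)      # first key position inside the window
        for start in range(lo, i + 1, block_size):
            end = min(start + block_size, i + 1)
            s = (k[start:end] @ q_i) * scale  # scores for this key block
            m_new = torch.maximum(m, s.max())
            # Rescale previous accumulators to the new max, then add this block.
            alpha = torch.exp(m - m_new)
            p = torch.exp(s - m_new)
            l = l * alpha + p.sum()
            acc = acc * alpha + p @ v[start:end]
            m = m_new
        return acc / l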

Environment

  • You should have Python 3.10+ installed on your machine (a quick sanity-check snippet follows this list).
  • (Optional) You should ideally have NVIDIA GPU(s) with CUDA 12.0+ installed on your machine; otherwise, some features may not work properly. (We will do our best to ensure that differences in hardware do not affect your score.)
  • Install all the necessary dependencies with the following command, which may vary slightly among assignments:
    pip install -r requirements.txt
  • (Optional) To avoid dependency conflicts, we strongly recommend using a Docker image from the NVIDIA PyTorch Release (e.g., 23.10 or newer) as your base environment.
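
Assuming PyTorch is among the dependencies in requirements.txt, a quick sanity check of the environment might look like this (the expected versions mirror the bullets above):

    import sys
    import torch

    # Quick sanity check of the environment described above.
    print("Python:", sys.version.split()[0])        # expect 3.10 or newer
    print("PyTorch:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("CUDA version:", torch.version.cuda)  # expect 12.0 or newer
        print("Device:", torch.cuda.get_device_name(0))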