COVID-QA is a question answering system built on the Longformer model and designed specifically for COVID-19 research. This repository contains the code and resources needed to train and deploy the Longformer model for COVID-19 question answering tasks.
- Utilizes the Longformer model, which overcomes the limitations of traditional transformers in processing long sequences of text.
- Fine-tuned on the COVID-QA dataset, consisting of 2,019 question/answer pairs annotated by biomedical experts on scientific articles related to COVID-19.
- Implements a sliding window local attention mechanism to handle sequences containing thousands of tokens.
- Achieves state-of-the-art performance on the COVID-QA dataset (F1-score 93.15, Exact Match 86.92), providing accurate and contextually rich answers.
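The sliding window idea mentioned above can be illustrated with a small, framework-free sketch. It is not the Longformer implementation; the function names `sliding_window_mask` and `count_attended` are hypothetical and exist only for this example. Each token is allowed to attend to neighbours within a fixed window, so the number of attended pairs grows roughly linearly with sequence length instead of quadratically.

```python
def sliding_window_mask(seq_len, window):
    """Build a seq_len x seq_len boolean matrix.

    mask[i][j] is True when token i may attend to token j, i.e. when
    j lies within window // 2 positions of i (hypothetical helper).
    """
    half = window // 2
    return [
        [abs(i - j) <= half for j in range(seq_len)]
        for i in range(seq_len)
    ]


def count_attended(mask):
    """Count the (query, key) pairs that the mask allows."""
    return sum(sum(row) for row in mask)


if __name__ == "__main__":
    local = sliding_window_mask(seq_len=4096, window=512)
    full_pairs = 4096 * 4096
    # Compare the cost of local attention with full self-attention.
    print(count_attended(local), "of", full_pairs, "pairs attended")
```

In the real model the window is applied inside the attention layers; this sketch only shows which token pairs a local window keeps.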
- Python 3.7 or higher
- Jupyter Notebook
- PyTorch
- Transformers library
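Before opening the notebook, the requirements above can be checked with a short script. This is a minimal sketch using only the standard library; the module names `torch`, `transformers`, and `notebook` are assumed to be the import names of PyTorch, the Transformers library, and Jupyter Notebook, and `missing_requirements` is a hypothetical helper.

```python
import importlib.util
import sys

# Assumed import names for the packages listed above.
REQUIRED_MODULES = {
    "torch": "PyTorch",
    "transformers": "Transformers library",
    "notebook": "Jupyter Notebook",
}


def missing_requirements(modules=REQUIRED_MODULES, min_python=(3, 7)):
    """Return a list of human-readable problems; empty if all checks pass."""
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append("Python " + wanted + " or higher")
    for module_name, label in modules.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module_name) is None:
            problems.append(label)
    return problems


if __name__ == "__main__":
    problems = missing_requirements()
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("All listed requirements were found.")
```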
- Clone this repository to your local machine.
- Open the `covid_longformer_qa_training.ipynb` file in Jupyter Notebook.
- All requirements are installed through pip from within the notebook.
- Follow the instructions provided in the notebook to train and evaluate the Longformer model.
- Customize the notebook to suit your specific needs or use the pre-trained model for inference.
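When using an extractive question answering model for inference, the model typically produces one start score and one end score per token, and the answer is the token span with the best combined score. The sketch below shows that post-processing step in plain Python. It is a generic illustration, not code taken from the notebook; `best_answer_span` and `max_answer_len` are hypothetical names.

```python
def best_answer_span(start_scores, end_scores, max_answer_len=30):
    """Pick the (start, end) token indices with the highest combined score.

    Only spans with start <= end and at most max_answer_len tokens are
    considered, which avoids the invalid spans that two independent
    argmax calls can produce.
    """
    best_span = (0, 0)
    best_score = float("-inf")
    for start, start_score in enumerate(start_scores):
        last = min(start + max_answer_len, len(end_scores))
        for end in range(start, last):
            score = start_score + end_scores[end]
            if score > best_score:
                best_score = score
                best_span = (start, end)
    return best_span


if __name__ == "__main__":
    tokens = ["the", "virus", "spreads", "via", "droplets"]
    # Made-up scores for illustration only.
    start_scores = [0.1, 0.2, 0.1, 0.3, 2.0]
    end_scores = [0.0, 0.1, 0.2, 0.1, 1.5]
    start, end = best_answer_span(start_scores, end_scores)
    print(" ".join(tokens[start:end + 1]))
```

With a real model, the two score lists would come from the model's start and end outputs for the tokenized question and context.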
The COVID-QA dataset consists of 2,019 question/answer pairs annotated by volunteer biomedical experts on scientific articles related to COVID-19. The dataset focuses on 147 scientific articles from the CORD-19 dataset, providing a specialized resource for training the Longformer model in COVID-19 question answering.
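As a rough illustration of how such question/answer pairs can be read, the sketch below walks a SQuAD-style nested dictionary. That the COVID-QA file follows the SQuAD JSON layout (`data`, `paragraphs`, `qas`, `answers`) is an assumption here, and `iter_qa_pairs` is a hypothetical helper; check the notebook for the actual loading code.

```python
def iter_qa_pairs(squad_dict):
    """Yield one flat record per answer from a SQuAD-style dictionary."""
    for article in squad_dict["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                for answer in qa["answers"]:
                    yield {
                        "question": qa["question"],
                        "context": context,
                        "answer_text": answer["text"],
                        "answer_start": answer["answer_start"],
                    }


if __name__ == "__main__":
    # Tiny made-up example in the assumed layout, not real dataset content.
    toy = {
        "data": [{
            "paragraphs": [{
                "context": "The virus spreads via droplets.",
                "qas": [{
                    "question": "How does the virus spread?",
                    "answers": [{"text": "via droplets", "answer_start": 18}],
                }],
            }],
        }],
    }
    for pair in iter_qa_pairs(toy):
        start = pair["answer_start"]
        # The character offset should point at the answer inside the context.
        print(pair["context"][start:start + len(pair["answer_text"])])
```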
- Achieved an F1-score of 93.15 and an Exact Match score of 86.92 on the COVID-QA dataset, surpassing previous benchmarks.
- Outperformed other transformer-based models like RoBERTa in COVID-19 question answering tasks.
- Provided accurate and contextually rich answers to COVID-19 queries, helping advance COVID-19 research and awareness.
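For readers unfamiliar with the two metrics reported above, the following sketch shows how Exact Match and token-level F1 are commonly computed for extractive question answering, following the SQuAD-style convention of lowercasing and stripping punctuation and articles. Whether the notebook uses exactly this normalization is an assumption, and the function names are illustrative.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse spaces."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction, truth):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    truth_tokens = normalize(truth).split()
    if not pred_tokens or not truth_tokens:
        return float(pred_tokens == truth_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The spike protein.", "spike protein"))
    print(f1_score("spike protein binds ACE2", "spike protein"))
```

Dataset-level scores are usually the mean of these per-question values, multiplied by 100 to express them on the 0 to 100 scale used above.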
- https://ihiratanveer.medium.com/unleashing-the-power-of-long-sequences-in-nlp-why-can-longformers-be-used-for-question-answering-27099ecb3d1c
- https://ihiratanveer.medium.com/longformer-empowering-covid-19-research-with-advanced-transformer-capabilities-cf70fd3e8509
Contributions to this project are welcome! If you find any issues or have ideas for improvements, please feel free to open an issue or submit a pull request.