Hail-Annotate is a Python wrapper around Hail that streamlines the annotation of human genetic variants. You provide an input file in Variant Call Format (VCF) and receive output annotated with allele frequency information from the gnomAD database.
The following steps are required to launch an annotation task:
- Ensure you have a local installation of the Hail library.
- Configure a Google Cloud service account (see the setup guide in the full documentation).
- Edit the config.json to configure your analysis.
- Upload your VCF to a Google Cloud Storage bucket.
- Launch an annotation using:
hailctl dataproc submit gnomad-test /local/path/to/hail_annotation.py \
--config gs://bucket/config.json \
--region <region, e.g. us-west1>
This workflow assumes that your service account has the storage.objectAdmin role on the bucket used for input and output.
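If you prefer to script the upload and launch steps, the two commands can be assembled in Python before running them. This is only a minimal sketch, not part of Hail-Annotate: the function names `build_upload_command` and `build_submit_command` are hypothetical, the local VCF path is a placeholder, and the upload assumes that `gsutil` is installed and authenticated. The cluster name, script path, and config location are taken from the example command above.

```python
import shlex
from typing import List


def build_upload_command(local_vcf: str, bucket_uri: str) -> List[str]:
    """Build a gsutil command that copies a local VCF into a bucket (hypothetical helper)."""
    return ["gsutil", "cp", local_vcf, bucket_uri]


def build_submit_command(cluster: str, script: str, config_uri: str, region: str) -> List[str]:
    """Build the hailctl submit command shown in the steps above (hypothetical helper)."""
    return [
        "hailctl", "dataproc", "submit", cluster, script,
        "--config", config_uri,
        "--region", region,
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your own paths, bucket, and region.
    upload = build_upload_command("/local/path/to/input.vcf", "gs://bucket/")
    submit = build_submit_command(
        "gnomad-test",
        "/local/path/to/hail_annotation.py",
        "gs://bucket/config.json",
        "us-west1",
    )
    # Print the commands for review rather than executing them.
    print(shlex.join(upload))
    print(shlex.join(submit))
```

Building each command as a list keeps the arguments separate, so the list can be passed directly to `subprocess.run(cmd, check=True)` once you are happy with the printed output.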
Full documentation (including a guide on setting up your Google Cloud analysis) can be found at https://bbowles1.github.io/Hail-Annotate/index.html.