edits to ep 2
tylermcinnes committed Oct 21, 2024
1 parent d2ba594 commit 79a7230
Showing 1 changed file with 5 additions and 6 deletions.
11 changes: 5 additions & 6 deletions docs/02-data-prelude.md
@@ -89,13 +89,12 @@ strains Cit+. Ultimately, we will use R to answer these questions:
## How VCF files are generated

Publicly accessible sequencing files in FASTQ formats can be downloaded
-from NCBI SRA. However, at FASTQ files contain unaligned sequences of
-varying quality, and requires clean up and alignment steps for variants
-to be called from the reference genome.
+from NCBI SRA. However, FASTQ files contain unaligned sequences of
+varying quality, and require clean up and alignment steps before variants
+can be called from the reference genome.

+There are five steps we must take to transform raw FASTQ files into variant calls (VCF files). At each of the five steps we will be using specialized, non-R based bioinformatics tools:

-Five steps are taken to transform FASTQ files to variant calls contained
-in VCF files and at each step, specialized non-R based bioinformatics
-tools that are used:

<figure markdown>
![image](figures/variant_calling_workflow.png){ width="250" }