Tutorial: Germline variant calling pipeline using Snakemake
4 weeks ago
nhaus ▴ 100

Hello everybody,

As part of a project, I had to write an in-house pipeline to call germline mutations for ~100 patients.

For that I used Snakemake and GATK's Best Practices guidelines. Steps that take a long time (HaplotypeCaller or base quality score recalibration) are automatically parallelized over genomic intervals.
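To illustrate what parallelizing over genomic intervals can look like, here is a minimal Python sketch. It is not taken from the repository: the function names, the toy contig lengths, and the fixed chunk size are all assumptions made for the example. It splits each contig into consecutive 1-based, inclusive chunks and formats them as `contig:start-end` strings, the form GATK accepts for its `-L` argument, so that each chunk could be handed to a separate job.

```python
from typing import Dict, List, Tuple

# (contig, start, end) with 1-based, inclusive coordinates
Interval = Tuple[str, int, int]


def split_into_intervals(contig_lengths: Dict[str, int], chunk_size: int) -> List[Interval]:
    """Split every contig into consecutive chunks of at most chunk_size bases.

    Hypothetical helper for illustration; the real pipeline may define
    its intervals differently (e.g. from an interval list file).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    intervals: List[Interval] = []
    for contig, length in contig_lengths.items():
        start = 1
        while start <= length:
            # The last chunk of a contig may be shorter than chunk_size.
            end = min(start + chunk_size - 1, length)
            intervals.append((contig, start, end))
            start = end + 1
    return intervals


def to_gatk_interval(interval: Interval) -> str:
    """Format an interval as 'contig:start-end'."""
    contig, start, end = interval
    return f"{contig}:{start}-{end}"


if __name__ == "__main__":
    # Toy contig lengths, chosen only to keep the output short.
    toy_contigs = {"chr1": 2500, "chr2": 1000}
    for iv in split_into_intervals(toy_contigs, chunk_size=1000):
        print(to_gatk_interval(iv))
```

In a Snakemake workflow, each of these strings would typically become a wildcard value, so that one job runs per interval and the per-interval outputs are merged afterwards.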

Furthermore, I tried to document the requirements for running the pipeline yourself as thoroughly as possible, and I included links to where the gold-standard reference material can be downloaded, so it should be easy to use for people without a lot of experience.

I hope this is useful for anyone who is also trying to perform germline variant calling.

If you have any questions about the pipeline or suggestions for improving it, please let me know.

You can find the project here:

https://github.com/nickhir/GermlineMutationCalling

Cheers!

snakemake GATK germline_variant_calling • 262 views
4 weeks ago

Nice! A small suggestion... Instead of listing programs in the Installation section you could provide a requirements.txt file listing those programs, like:

snakemake >=6
samtools =1.10
...etc


Then tell the user to set up the conda environment with:

conda create -n GermlineMutationCalling
conda activate GermlineMutationCalling
mamba install --file requirements.txt # Use mamba, much better than conda


If someone doesn't want to use conda, the requirements.txt is still useful.

I think this is a lot easier for both users and developers.