Question: Whole Genome Sequencing Data Annotation
26 days ago, ashley_hertzog wrote:

Hello experts,

My centre is working on an analysis pipeline for whole genome sequencing data. The sequencing and alignment are being performed off-site and my centre will be receiving VCF files for annotation and curation of variants.

There is little to no literature on validating pipelines for WGS. If anyone has any, would you kindly share?

Does anyone have a proposed pipeline for annotation? We will be using Alissa 5.3 Interpret and were thinking of initially filtering variants out by read depth and then sorting them into variant type (SNV, CNV, and SV). Or would it be better to have two separate pipelines for annotation? One for CNVs and one for SNVs?

Following the variant type filter for CNVs, would it then make sense to sort them by size (> or < 5kb)?
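To make the proposed triage concrete, here is a minimal Python sketch of the three steps described above: drop low-depth records, split the rest by variant type, then bin CNVs/SVs around 5 kb. It is only an illustration, not what Alissa does internally. The depth threshold `MIN_DEPTH` is a made-up placeholder, the function names are hypothetical, and it assumes the VCF carries depth in the INFO `DP` key and describes structural variants with the standard `SVTYPE`, `SVLEN` and `END` INFO keys.

```python
# Sketch: triage VCF data lines by read depth, variant type, and CNV/SV size.
MIN_DEPTH = 10        # hypothetical threshold, tune per assay
SIZE_CUTOFF = 5_000   # the 5 kb cutoff proposed in the question


def parse_info(info_field):
    """Turn 'DP=35;SVTYPE=DEL;END=12000' into a dict; bare flags map to True."""
    info = {}
    for item in info_field.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info


def classify(record_line):
    """Return (category, size_bin) for one VCF data line, or None if filtered out."""
    fields = record_line.rstrip("\n").split("\t")
    pos, ref, alt = int(fields[1]), fields[3], fields[4]
    info = parse_info(fields[7])

    # Step 1: depth filter. Records without DP (common for SV callers) are kept.
    depth = info.get("DP")
    if depth is not None and int(depth) < MIN_DEPTH:
        return None

    # Step 2a: structural records (symbolic ALT such as <DEL>, or an SVTYPE key).
    if "SVTYPE" in info or alt.startswith("<"):
        svtype = str(info.get("SVTYPE", alt.strip("<>"))).split(":")[0]
        if "SVLEN" in info:
            length = abs(int(info["SVLEN"].split(",")[0]))
        elif "END" in info:
            length = int(info["END"]) - pos
        else:
            length = None
        category = "CNV" if svtype in {"DEL", "DUP", "CNV"} else "SV"
        # Step 3: size bin around the 5 kb cutoff.
        if length is None:
            size_bin = "unknown"
        else:
            size_bin = "<5kb" if length < SIZE_CUTOFF else ">=5kb"
        return category, size_bin

    # Step 2b: small variants. Anything that is not a single-base change
    # is lumped into one extra bucket here (an addition to the SNV/CNV/SV split).
    if len(ref) == 1 and all(len(a) == 1 for a in alt.split(",")):
        return "SNV", None
    return "indel_or_MNV", None
```

For example, `classify("chr1\t1000\t.\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=9000")` lands in the CNV bucket at or above 5 kb. Whether one pipeline or two is better is a separate question; the sketch only shows that the split itself is mechanical once the VCF conventions of the upstream caller are known.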

I was just hoping to bounce some ideas back and forth as this is a first for our centre and we currently do not have access to a bioinformatician.

Thank you for any and all help!

Written 26 days ago by ashley_hertzog (modified 25 days ago)

https://github.com/imgag/megSAP - this is how we do it in our clinics.

Reply written 25 days ago by German.M.Demidov

Have you checked the GATK tools?

https://gatk.broadinstitute.org/hc/en-us

Reply written 26 days ago by Mehmet

Sarek, which is a Nextflow pipeline, is quite nice for WGS analysis.

Reply written 25 days ago by husensofteng

> We will be using Alissa 5.3 I

It looks like you are planning to use a commercial tool to annotate the VCF files. If you have no command line expertise or access to Unix servers, then this may be the way to go. All the tools mentioned in this thread require access to, and some expertise with, the command line.

> There is little to no literature on validating pipelines for WGS.

Since you are not going to do the primary analysis of the data, you have no validation to do for that part of the pipeline. There is literature available on pipeline validation (paper1, paper2 etc).

GDC has a defined DNAseq analysis pipeline. GATK best practices workflows are a good place to start as well.

Reply written 25 days ago by genomax (modified 25 days ago)

Hi everyone,

You've given me a bit more direction and I'm feeling less lost now. I definitely do not have command line expertise or access to Unix servers. The purpose of this exercise is to first implement the variant annotation in a research setting and then transition it into a diagnostic workflow for rapid WGS for acute care patients.

Really appreciate the feedback!

Reply written 25 days ago by ashley_hertzog

I can also advertise the tool I wrote during my PhD studies: https://github.com/imgag/ClinCNV. It detects CNVs in clinical settings (1 kb resolution at 30x coverage; one can go to higher resolution with higher coverage, but I would not go finer than 500 bp, because the files become huge). It is currently in use in maybe 4 hospitals. It may not be the best tool in terms of precision/recall (though it is surely decent), but I got a massive amount of feedback from clinicians and implemented everything they asked for. Several hundred patients have been diagnosed with it in our clinics alone.

Here is the presentation: https://github.com/imgag/ClinCNV/blob/master/doc/ClinCNV_thesis_presentation.pdf

Reply written 25 days ago by German.M.Demidov
25 days ago, Shalu Jhanwar (Switzerland) wrote:

SnpEff (http://snpeff.sourceforge.net/VCFannotationformat_v1.0.pdf), ANNOVAR (https://www.nature.com/articles/nprot.2015.105), and Variant Effect Predictor (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0974-4) are some of the most commonly used tools for variant annotation.
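As a rough illustration of what such annotation looks like downstream, the sketch below reads the `ANN` INFO field described in the SnpEff annotation format document linked above: one comma-separated entry per annotation, each with 16 pipe-separated subfields. The dictionary keys, the function names, and the choice to keep only HIGH and MODERATE impact are my own assumptions for the example, and the gene and transcript identifiers in the usage note are invented. VEP writes its annotations to a differently laid-out `CSQ` field by default, and ANNOVAR uses its own keys, so this parser would not apply to them unchanged.

```python
# Sketch: pull SnpEff-style ANN annotations out of a VCF INFO string.
# Key names below are hypothetical labels for the 16 subfields of the ANN format.
ANN_FIELDS = [
    "allele", "annotation", "impact", "gene_name", "gene_id",
    "feature_type", "feature_id", "transcript_biotype", "rank",
    "hgvs_c", "hgvs_p", "cdna_pos", "cds_pos", "aa_pos",
    "distance", "errors",
]


def parse_ann(info_field):
    """Return a list of dicts, one per ANN entry; empty list if ANN is absent."""
    for item in info_field.split(";"):
        if item.startswith("ANN="):
            entries = item[len("ANN="):].split(",")
            return [dict(zip(ANN_FIELDS, entry.split("|"))) for entry in entries]
    return []


def keep_impactful(annotations, impacts=("HIGH", "MODERATE")):
    """Keep annotations whose putative impact is in `impacts` (assumed filter)."""
    return [a for a in annotations if a.get("impact") in impacts]
```

Calling `parse_ann` on an INFO string such as `"DP=35;ANN=G|missense_variant|MODERATE|GENE1|ID1|transcript|TX1|protein_coding|3/10|c.100A>G|p.Lys34Glu|200/1500|100/900|34/299||"` yields one dictionary whose `impact` is `MODERATE`, which `keep_impactful` would retain.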
