Question

Structural Variation (SV) Annotation Workflow

5

12.2 years ago

Doctoroots ▴ 800

I want to know if there is some kind of pipeline I can use to annotate a given list of structural variants detected by NGS.

Specifically, I just ran BreakDancer on paired-end sequencing data and received output listing the different types of predicted variants (insertions, deletions, inversions, translocations). Example:

1 1413910 9+0- 1 1452528 0+10- DEL 38636 99

1 1588571 0+17- 1 1653965 17+0- ITX 64960 99

The columns are: chr1, position1, supporting reads (forward/reverse), chr2, position2, supporting reads (forward/reverse), variation type, variation size, quality.

I basically want to annotate these variants with gene name, exon/intron, transcription factor binding sites, ncRNA (miRNA) and any other known features of the genomic regions they affect.

Is there a known tool for this? If not, I would appreciate any suggestions on how to perform such a task (for example, an outline of a workflow I could build).

Thanks.
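As a starting point for any annotation step, here is a minimal Python sketch of reading lines in this format. It assumes whitespace-separated fields in the column order listed above, with integer positions, size and quality as in the two example lines; the names `SVCall`, `parse_sv_line` and `to_interval` are made up for illustration and are not part of BreakDancer.

```python
from typing import NamedTuple, Optional, Tuple


class SVCall(NamedTuple):
    chrom1: str
    pos1: int
    reads1: str   # supporting reads, e.g. "9+0-"
    chrom2: str
    pos2: int
    reads2: str
    sv_type: str  # e.g. DEL, ITX
    size: int
    quality: int


def parse_sv_line(line: str) -> SVCall:
    """Parse one whitespace-separated SV line (assumed column order)."""
    fields = line.split()
    if len(fields) < 9:
        raise ValueError(f"expected at least 9 columns, got {len(fields)}")
    return SVCall(
        fields[0], int(fields[1]), fields[2],
        fields[3], int(fields[4]), fields[5],
        fields[6], int(fields[7]), int(fields[8]),
    )


def to_interval(call: SVCall) -> Optional[Tuple[str, int, int]]:
    """Return a BED-style (chrom, start, end) interval, 0-based half-open.

    Only meaningful when both breakpoints are on the same chromosome;
    returns None otherwise, since those breakpoints need separate handling.
    """
    if call.chrom1 != call.chrom2:
        return None
    start, end = sorted((call.pos1, call.pos2))
    return (call.chrom1, start - 1, end)


call = parse_sv_line("1 1413910 9+0- 1 1452528 0+10- DEL 38636 99")
print(call.sv_type, to_interval(call))
```

Converting same-chromosome calls to BED-style intervals makes it easy to hand them to interval-based tools later on.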

structural annotation workflow • 6.7k views

ADD COMMENT • link updated 12.2 years ago by Raymond301 ▴ 160 • written 12.2 years ago by Doctoroots ▴ 800

1

How did you end up annotating your SV data? I find myself in the same kind of situation.

ADD REPLY • link 11.4 years ago by William ★ 5.3k

0

SV annotation (with OMIM, DGV, 1000g, haploinsufficiency, TAD, ... and also with your own in-house information) can be easily automated!

You can look at this post describing the AnnotSV tool: Annotation for SV and CNV

ADD REPLY • link 5.8 years ago by LGMgeo ▴ 100

score 1 · Answer 1 · 2012-03-09

1

12.1 years ago

Bioscientist ★ 1.7k

BEDTools may help you.

ADD COMMENT • link 12.1 years ago by Bioscientist ★ 1.7k

Ram · Answer 2 · 2012-03-09

1

12.1 years ago

Raymond301 ▴ 160

There happens to be a complete pipeline that results in IGV visualization as well as flat files for tertiary analysis.

Look at this publication:

http://bioinformatics.oxfordjournals.org/content/early/2011/11/14/bioinformatics.btr612.full.pdf

It can be downloaded here:

http://ndc.mayo.edu/mayo/research/biostat/stand-alone-packages.cfm

--> This is a full front-to-back pipeline. However, it has options to run from any step: everything, just alignment, just variant calling, or just annotation.

** It's important to note that this pipeline uses BAM and VCF as its standard file types, so that the modules can be run independently. You'll need to refer to PacBio or some other source for converters.

ADD COMMENT • link updated 4.6 years ago by Ram 43k • written 12.1 years ago by Raymond301 ▴ 160

0

*** Note that there is also a Cloud deployment in case you don't have the computational resources to run this pipeline.

ADD REPLY • link 12.1 years ago by Raymond301 ▴ 160

0

Hi Raymond, although the tool looks useful for other purposes, I couldn't see how it handles structural variants such as translocations and inversions.

ADD REPLY • link 12.1 years ago by Doctoroots ▴ 800

0

You are correct. Upon closer inspection, translocations and inversions are not readily annotated by this software. I apologize.

ADD REPLY • link 12.1 years ago by Raymond301 ▴ 160

score 0 · Answer 3 · 2012-03-06

I have basically the same question. I have to create an SV pipeline as a project for my master's. We also have to include some sort of annotation. For now we annotate the output using a GFF file, which contains the positions of genes, exons, introns, etc. We compare these positions with our SV list.
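A minimal Python sketch of that GFF comparison might look like the following. It assumes the standard 9-column, tab-separated GFF layout with 1-based inclusive coordinates, and uses a plain linear scan; the feature lines and the names `Feature`, `parse_gff` and `overlapping_features` are invented for illustration. For genome-scale files an interval index or a tool such as BEDTools would probably be a better fit.

```python
from typing import Iterable, List, NamedTuple


class Feature(NamedTuple):
    chrom: str
    kind: str        # GFF column 3, e.g. gene, exon
    start: int       # 1-based, inclusive
    end: int         # 1-based, inclusive
    attributes: str  # GFF column 9, kept as raw text


def parse_gff(lines: Iterable[str]) -> List[Feature]:
    """Read GFF records, skipping comments, blanks and short lines."""
    features = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue
        features.append(
            Feature(cols[0], cols[2], int(cols[3]), int(cols[4]), cols[8])
        )
    return features


def overlapping_features(
    chrom: str, start: int, end: int, features: List[Feature]
) -> List[Feature]:
    """Return features sharing at least one base with [start, end]."""
    return [
        f for f in features
        if f.chrom == chrom and f.start <= end and f.end >= start
    ]


# Made-up annotation records, for illustration only.
gff_lines = [
    "##gff-version 3",
    "1\texample\tgene\t1400000\t1420000\t.\t+\t.\tID=geneA",
    "1\texample\texon\t1410000\t1410500\t.\t+\t.\tParent=geneA",
    "1\texample\tgene\t2000000\t2010000\t.\t-\t.\tID=geneB",
]
features = parse_gff(gff_lines)
for hit in overlapping_features("1", 1413910, 1452528, features):
    print(hit.kind, hit.attributes)
```

For translocations, where the two breakpoints lie on different chromosomes, one option is to query each breakpoint separately as a short interval rather than the span between them.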

Furthermore, we compare SV calls from different software tools to see which SVs are supported by multiple SV callers.

I am really wondering what other people's thoughts are about annotating SVs.
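One way to sketch the multi-caller comparison in Python is with reciprocal overlap: two calls of the same type on the same chromosome count as the same event if each covers at least a given fraction of the other. The 50% threshold, the 0-based half-open coordinates, the caller names and the function names below are all assumptions for illustration, not how any particular pipeline does it.

```python
from typing import Dict, List, Tuple

# (chrom, start, end, sv_type), 0-based half-open coordinates
Call = Tuple[str, int, int, str]


def reciprocal_overlap(a: Call, b: Call) -> float:
    """Smaller of the two overlap fractions; 0.0 if chrom or type differ."""
    if a[0] != b[0] or a[3] != b[3]:
        return 0.0
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return 0.0
    return min(shared / (a[2] - a[1]), shared / (b[2] - b[1]))


def count_support(
    call: Call,
    calls_by_caller: Dict[str, List[Call]],
    min_overlap: float = 0.5,
) -> int:
    """Number of callers with at least one call matching `call`."""
    return sum(
        any(reciprocal_overlap(call, other) >= min_overlap for other in calls)
        for calls in calls_by_caller.values()
    )


# Hypothetical callers and calls, for illustration only.
calls_by_caller = {
    "caller_a": [("1", 1000, 2000, "DEL")],
    "caller_b": [("1", 1100, 2100, "DEL")],
    "caller_c": [("1", 5000, 6000, "DEL")],
}
query = ("1", 1000, 2000, "DEL")
print(count_support(query, calls_by_caller))
```

Interval overlap like this suits deletions, duplications and inversions; insertions and translocations are usually better matched by breakpoint distance, which this sketch does not cover.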