Question: annotation of SV (Structural Variants)
3
2.5 years ago by
Bogdan850
Palo Alto, CA, USA
Bogdan850 wrote:

Dear all,

For a list of structural variants (deletions, duplications, inversions, translocations), in either VCF or BEDPE format, we would like to have the gene annotations and the following sets of genes:

-- fusions (if both breakpoints are in exons, introns, or UTRs)
-- truncations (if only one breakpoint is in an exon, intron, or UTR, and the other breakpoint is in an intergenic region)
-- the genes in the regions that are deleted, duplicated, or inverted

Although I wrote some Perl scripts based on ANNOVAR, I thought that we could get all these annotations with a package that is already available?

thanks a lot,

-- bogdan

modified 14 months ago by LGMgeo90 • written 2.5 years ago by Bogdan850

Dear Daniel, these are very good suggestions, thank you! I'm planning to use StructuralVariantAnnotation and compare the results with those derived from my Perl scripts.

Our work is primarily on somatic SVs (in pediatric cancers), so may I ask: do you have any recommendations regarding which SV callers to use? I've started with DELLY, LUMPY, and MANTA, and I am now comparing the results.

Also, I've read your paper and your work on GRIDSS, and it looks great ;) although it seems that the focus has been more on germline calls ;)

written 2.5 years ago by Bogdan850
2

The GRIDSS paper focused on germline results, but most of our applications have been in cancer genomics, and GRIDSS did manage to win the ICGC-TCGA DREAM Somatic Mutation Calling Challenge (SV sub-challenge #5).

See https://github.com/PapenfussLab/gridss/blob/master/example/somatic.sh for very basic tumour/normal somatic variant calling using GRIDSS.

written 2.5 years ago by d-cameron2.1k

Thanks, Daniel. I can run GRIDSS as soon as our new PBS cluster is completely configured.

May I also ask what filtering criteria you would recommend for SVs, particularly for allele fraction (AF) or the number of split reads (SR) and paired reads (PR)?

And, if you do not mind me asking, in the Somatic Mutation Calling Challenge, besides DELLY, MANTA and GRIDSS, which other algorithms did reasonably well?

written 2.5 years ago by Bogdan850
1

Somatic calling Leaderboard results are publicly available at https://www.synapse.org/#!Synapse:syn312572/wiki/61509

written 2.5 years ago by d-cameron2.1k

Dear Daniel, thank you for the information on SV calling. Given your experience with SV callers, and the nice ROC curves in your publication, may I ask:

-- About filtering: would you have strong recommendations for numerical thresholds on allele fraction, or on the number of paired reads or split reads?

-- Using 2-3 SV callers probably gives fewer false negatives than using only one. If so, besides GRIDSS, which other SV caller(s) would you recommend?

Thanks a lot for sharing your experience with us!

written 2.5 years ago by Bogdan850
1

Please do not add answers unless you're answering the top level question. If you're replying to someone, use the Add Comment or Add Reply options. I'm moving your "answer"s to comments now.

modified 2.5 years ago • written 2.5 years ago by RamRS24k

Thanks, Ram ;) A pretty exciting conversation, I must say ;)

written 2.5 years ago by Bogdan850

Sure, but multiple answers that do not answer the question confuse people. Please check out https://www.biostars.org/t/how-to/ for more information. Thank you!

written 2.5 years ago by RamRS24k

ok ;) thank you, Ram ;)

written 2.5 years ago by Bogdan850

How did you get on with this? I'm developing a pipeline that works well for me, if you're still looking for help.

written 2.2 years ago by nr2380
6
2.5 years ago by
d-cameron2.1k
Australia
d-cameron2.1k wrote:

SVs are problematic for many pipelines/software as, unlike SNVs and small indels, each event involves at least two genomic loci.

Be aware that not all callers classify events correctly. Many callers classify events purely on their break-end position and orientation. This results in deletion calls even when there is no copy number change to support the event (most callers), or inversion calls even when only one of the two inversion breakpoints actually exists (e.g. DELLY). For simple germline analysis this is probably OK, and you can just ignore all large or inter-chromosomal events, but for highly rearranged genomes (e.g. cancer), things are much more complicated.
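
To make the caveat concrete, here is a minimal Python sketch of classification based on position and orientation only. The function name and the strand convention (a "+" strand on the lower-coordinate break-end is read as deletion-like) are assumptions for illustration; conventions differ between callers, and this is not any particular caller's code. Nothing here checks copy number or whether the reciprocal inversion breakpoint exists, which is why the labels only say "-like".

```python
def naive_sv_type(chrom1, pos1, strand1, chrom2, pos2, strand2):
    """Label a break-end pair from position and orientation alone.

    Assumed convention (hypothetical, varies by caller): with the
    break-ends ordered by coordinate, '+' on the lower one and '-' on
    the upper one looks like a deletion, the reverse looks like a
    tandem duplication, and equal strands look like an inversion.
    """
    if chrom1 != chrom2:
        return "translocation-like"
    if strand1 == strand2:
        return "inversion-like"
    # Use the strand of whichever break-end has the lower coordinate.
    low_strand = strand1 if pos1 <= pos2 else strand2
    return "deletion-like" if low_strand == "+" else "duplication-like"


# One '+/+' junction is labelled inversion-like even though a real
# inversion needs a second ('-/-') junction to be present as well.
label = naive_sv_type("chr1", 100, "+", "chr1", 500, "+")
```

A label from a function like this is only a starting point; for cancer genomes it would need to be checked against copy number and against the other break-ends in the sample.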

"thought that we could get all these annotations with a package that is already available"

What you're asking is really two separate processes: one for looking at the intervening sequence of simple events, and another for break-end overlap for fusions/interchromosomal/complex events.

If you're familiar with Bioconductor then you can do the first part relatively easily for a BEDPE: just convert to GRanges intervals and calculate overlaps against the Bioconductor annotation package for your organism.
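
As a rough illustration of the first part, here is the same overlap idea in plain Python rather than Bioconductor/GRanges. The gene table, gene names, and helper names are hypothetical; it assumes the standard first six BEDPE columns with 0-based, half-open coordinates, and treats a simple event as spanning from the first break-end to the second.

```python
from typing import NamedTuple, Optional


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def bedpe_span(line: str) -> Optional[tuple]:
    """Return (chrom, start, end) covered by an intra-chromosomal event.

    Returns None when the two break-ends are on different chromosomes,
    because there is no intervening sequence to annotate in that case.
    """
    f = line.rstrip("\n").split("\t")
    chrom1, start1, end1 = f[0], int(f[1]), int(f[2])
    chrom2, start2, end2 = f[3], int(f[4]), int(f[5])
    if chrom1 != chrom2:
        return None
    return chrom1, min(start1, start2), max(end1, end2)


def genes_in_span(span, genes):
    """Names of genes that overlap the half-open interval in `span`."""
    chrom, start, end = span
    return [g.name for g in genes
            if g.chrom == chrom and g.start < end and start < g.end]


# Hypothetical annotation; in practice this would come from a real gene model.
GENES = [
    Gene("GENE_A", "chr1", 1000, 5000),
    Gene("GENE_B", "chr1", 8000, 12000),
    Gene("GENE_C", "chr2", 1000, 5000),
]

record = "chr1\t4000\t4001\tchr1\t9000\t9001\tdel1\t.\t+\t-"
hits = genes_in_span(bedpe_span(record), GENES)
```

The linear scan is fine for a handful of events; for whole-genome call sets an interval index (which is what GRanges overlap functions provide) would be the better choice.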

For the second part you might be interested in my StructuralVariantAnnotation package. Its key feature is conversion of VCFs generated by a number of popular SV callers into a GRanges object containing break-end coordinates. Once in GRanges format, you can again use the Bioconductor annotation packages to calculate feature overlap.
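
Once break-end coordinates are available, the overlap step for the second part could look roughly like this in plain Python. This does not use StructuralVariantAnnotation or any Bioconductor API; the gene table and function names are hypothetical, the fusion/truncation labels follow the definitions in the question, and "intragenic" (both break-ends in the same gene) is an extra label assumed here.

```python
from typing import NamedTuple, Optional


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


# Hypothetical annotation; in practice this would come from a real gene model.
GENES = [
    Gene("GENE_A", "chr1", 1000, 5000),
    Gene("GENE_B", "chr2", 2000, 6000),
]


def gene_at(chrom: str, pos: int, genes) -> Optional[str]:
    """Name of the first gene whose interval contains pos, else None."""
    for g in genes:
        if g.chrom == chrom and g.start <= pos < g.end:
            return g.name
    return None


def classify_breakpoint_pair(bp1, bp2, genes):
    """Label a pair of (chrom, pos) break-ends by gene overlap.

    Both in different genes -> 'fusion'; both in the same gene ->
    'intragenic' (assumed label); exactly one in a gene -> 'truncation';
    neither in a gene -> 'intergenic'.
    """
    g1 = gene_at(bp1[0], bp1[1], genes)
    g2 = gene_at(bp2[0], bp2[1], genes)
    if g1 and g2:
        label = "fusion" if g1 != g2 else "intragenic"
    elif g1 or g2:
        label = "truncation"
    else:
        label = "intergenic"
    return label, g1, g2


result = classify_breakpoint_pair(("chr1", 3000), ("chr2", 2500), GENES)
```

This only tests whether each break-end falls inside a gene body; a real fusion call would also need break-end orientation, gene strand, and reading frame.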

written 2.5 years ago by d-cameron2.1k
0
14 months ago by
LGMgeo90
European Union
LGMgeo90 wrote:

I suggest using AnnotSV for SV annotation (annotation with gene names and locations, OMIM, DGV, 1000g, haploinsufficiency, TAD, ... and also with your own in-house information).

AnnotSV builds one annotation for the full-length SV and one annotation for each gene within the SV. You will thus have access to:

  • information on all the overlapped genes (ID, OMIM...)

  • the SV location within each overlapped gene (e.g. "exon3-intron11", "txStart-intron19", ...). You could thus identify fusion or truncation events.

Input format: VCF or BED

You can look at this post describing the AnnotSV tool: "Annotation for SV and CNV"

written 14 months ago by LGMgeo90