Question: Tool to compare SVs between two files
3.5 years ago by
Medhat8.8k wrote:

I called structural variants using Sniffles for two samples (PacBio reads). I would like to compare the two resulting files to find out which variants are shared and which differ between them.

What first comes to mind is vcf-isec, but I do not know how it handles cases such as a deletion that is present in both files but with different lengths, or a translocation whose origin is the same in both files but whose destination differs.

vcf-compare could also be used, but I do not know if it is the right approach.

Is there a tool that can do this efficiently? Does anyone have experience with this?


3.5 years ago by
United States
Zev.Kronenberg11k wrote:

One simple way to do this is to merge the SV calls and look at which calls collapse. I've written some code to do this in VCF format.

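The "merge and see which calls collapse" idea can be illustrated with a minimal Python sketch. This is not the code from the answer above: the `SV` record, the `same_event` and `compare` helpers, and the `max_dist` and `min_size_ratio` thresholds are all hypothetical and chosen only for illustration. Two calls are treated as the same event when they share chromosome and type, their breakpoints lie close together, and their lengths are similar, which is one way to handle a deletion that appears in both files with different sizes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical minimal SV record (not a real library type)."""
    chrom: str
    pos: int      # start coordinate (VCF POS)
    end: int      # end coordinate (VCF INFO/END)
    svtype: str   # e.g. "DEL", "DUP", "INV"


def same_event(a, b, max_dist=50, min_size_ratio=0.8):
    """Return True if two calls plausibly describe the same event.

    max_dist and min_size_ratio are illustrative thresholds, not defaults
    of any existing tool.
    """
    if a.svtype != b.svtype or a.chrom != b.chrom:
        return False
    # Both breakpoints must be within max_dist bases of each other.
    if abs(a.pos - b.pos) > max_dist or abs(a.end - b.end) > max_dist:
        return False
    # The shorter event must be at least min_size_ratio of the longer one.
    len_a, len_b = a.end - a.pos, b.end - b.pos
    longer = max(len_a, len_b)
    return longer == 0 or min(len_a, len_b) / longer >= min_size_ratio


def compare(calls_a, calls_b, **thresholds):
    """Split calls into (shared pairs, only in A, only in B)."""
    shared, only_a = [], []
    unmatched_b = list(calls_b)
    for a in calls_a:
        match = next((b for b in unmatched_b if same_event(a, b, **thresholds)), None)
        if match is None:
            only_a.append(a)
        else:
            shared.append((a, match))
            unmatched_b.remove(match)  # each call in B is matched at most once
    return shared, only_a, unmatched_b


# Made-up example calls for two samples.
sample1 = [SV("1", 1000, 1500, "DEL"), SV("1", 5000, 9000, "DEL"), SV("2", 300, 800, "INV")]
sample2 = [SV("1", 1010, 1490, "DEL"), SV("1", 5000, 6000, "DEL"), SV("2", 300, 800, "DUP")]

shared, only_1, only_2 = compare(sample1, sample2)
print(len(shared), len(only_1), len(only_2))
```

In this made-up data only the first deletion collapses; the second deletion differs too much in length, and the third pair differs in type. Translocations are not covered by this sketch, since they would need both breakpoints (origin and destination) compared separately.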

I merged them before using vcf-merge. It gives a clear result when a variant exists in one file but is absent from the other, but it becomes an issue when both files have a variant at the same position, whether of the same type (e.g. DUP) or of different types (e.g. INV, TRV). I will try the tool and give feedback. Thanks.

modified 3.2 years ago • written 3.5 years ago by Medhat

First, during installation I got this warning:

src/mergeSVcallers.cpp: In function ‘void manageLoopOverVar(std::vector<vcflib::Variant*>&)’:
src/mergeSVcallers.cpp:603:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
             if(i == tmpdata.size()){

There were also some warnings, and TRA events were skipped:

WARNING: TRA events are skipped  

WARNING: could not set region: seqid: 1 file: variant1_sort.vcf.gz
INFO: Seqid might not be in file
INFO: sorting: seqid: 1
n SVs in chunk: 7

For these fields:

CIPOS,Number=2,Type=Integer,Description="Confidence interval around POS for imprecise variants"
CIEND,Number=2,Type=Integer,Description="Confidence interval around END for imprecise variants"

I have them from -10 to 10 as seen below


What does this mean? (Does it mean that there are 10 reads supporting this result?)

modified 3.5 years ago • written 3.5 years ago by Medhat

It is simply a confidence interval around the start and end of the SV. By default I set it to ten. When multiple SVs are merged it gets wider.

written 3.5 years ago by Zev.Kronenberg
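To make the CIPOS/CIEND answer concrete: the two numbers are coordinate offsets relative to POS (or END), not read counts, so CIPOS=-10,10 means the breakpoint is expected somewhere between POS-10 and POS+10. The Python sketch below shows one plausible way an interval could widen when two calls are merged, by taking the union of their intervals. The helper names and the union rule are assumptions for illustration; the thread does not say how the merging tool actually computes the merged interval.

```python
def ci_bounds(pos, ci=(-10, 10)):
    """Absolute coordinates covered by a position and its CI offsets."""
    return pos + ci[0], pos + ci[1]


def intervals_overlap(x, y):
    """True if two closed intervals (lo, hi) share at least one coordinate."""
    return x[0] <= y[1] and y[0] <= x[1]


def merge_ci(pos_a, pos_b, ci_a=(-10, 10), ci_b=(-10, 10)):
    """Merge two breakpoint calls into one position plus a wider CI.

    Assumption: the merged interval is the union of both intervals and the
    first call's position is kept as the representative position.
    """
    lo_a, hi_a = ci_bounds(pos_a, ci_a)
    lo_b, hi_b = ci_bounds(pos_b, ci_b)
    lo, hi = min(lo_a, lo_b), max(hi_a, hi_b)
    return pos_a, (lo - pos_a, hi - pos_a)


# Two calls 8 bp apart, each with the +/-10 interval mentioned above.
print(intervals_overlap(ci_bounds(1000), ci_bounds(1008)))  # True
print(merge_ci(1000, 1008))  # (1000, (-10, 18))
```

Here the merged call keeps POS 1000 but its interval grows from (-10, 10) to (-10, 18), because it now has to cover the second call's uncertainty as well.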