How to connect DELLY output to ShatterSeek input
2.2 years ago
Xiaofan ▴ 10

Hi there,

I am using ShatterSeek to infer potential chromothripsis. The original ShatterSeek article appears to have used DELLY output as the structural variation input. However, in the ShatterSeek vignette, the demo data seems to have two chromosome breakpoints (the structural variation input format for ShatterSeek is shown below), while the output of DELLY only provides one.

chrom1 (character): chromosome for the first breakpoint
pos1 (character): position for the first breakpoint
chrom2 (character): chromosome for the second breakpoint
pos2 (character): position for the second breakpoint
SVtype (character): type of SV, encoded as: DEL (deletion-like; +/-), DUP (duplication-like; -/+), h2hINV (head-to-head inversion; +/+), and t2tINV (tail-to-tail inversion; -/-).
strand1 (e.g. + for DEL)
strand2 (e.g. - for DEL)

I am not sure whether the DELLY output needs further annotation to get it into this format for ShatterSeek. I do have TCHR and TSTART columns in my DELLY output, but most DUP and DEL SVs have no values in these columns. I am totally lost now; does anyone have any suggestions?

Many thanks in advance.

DELLY ShatterSeek Structural variation chromothripsis • 806 views

the demo data seems to have two chromosome breakpoints while the output of DELLY only provides one.

This is not correct: DELLY reports the second breakpoint in the INFO/CHR2 annotation (together with INFO/END), e.g. https://github.com/VCCRI/SVPV/blob/master/example/delly.vcf#L509

chr12 71315481 INV00010872 A <INV> . PASS PRECISE;SVTYPE=INV;SVMETHOD=EMBL.DELLYv0.7.3;CHR2=chr12;END=71316542;INSLEN=0;PE=68;MAPQ=60;CT=5to5;CIPOS=-42,42;CIEND=-42,42;SR=33;SRQ=1;CONSENSUS=GAGGAGGCCAGAGGTTGGGTAAACAGGGCCTGGCTGAGGTGTGTTGGCTCTACTGAGTGGATTTCTGCCTGCCACCTCATTGCTCTATTTGCAGCCTCATCCCAACCCCAGGCAGCAGTTAAAGAGAGAACAGGAGTAAAAATTAACAGG;CE=1.99026;RDRATIO=1.2289;AC=2;AN=6 (...)
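To make the point concrete, here is a minimal Python sketch of how both breakpoints could be pulled from a record like the one above. The helper names (`parse_info`, `vcf_line_to_sv`) are hypothetical, and the mapping from DELLY's CT tag to ShatterSeek's SVtype and strands is an assumption that should be checked against the DELLY and ShatterSeek versions in use. The example record is a shortened copy of the line above, keeping only the INFO tags the sketch reads.

```python
# ASSUMPTION: mapping from DELLY's CT (connection type) tag to ShatterSeek's
# SVtype and strand columns; verify against your DELLY/ShatterSeek versions.
CT_TO_SHATTERSEEK = {
    "3to5": ("DEL", "+", "-"),
    "5to3": ("DUP", "-", "+"),
    "3to3": ("h2hINV", "+", "+"),
    "5to5": ("t2tINV", "-", "-"),
}


def parse_info(info):
    """Turn a VCF INFO string into a dict; flags such as PRECISE map to True."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def vcf_line_to_sv(line):
    """Hypothetical helper: build one ShatterSeek-style row from a DELLY VCF record."""
    cols = line.split()  # VCF columns are whitespace/tab separated
    info = parse_info(cols[7])  # the eighth column is INFO
    svtype, strand1, strand2 = CT_TO_SHATTERSEEK[info["CT"]]
    return {
        "chrom1": cols[0],
        "pos1": int(cols[1]),
        "chrom2": info["CHR2"],    # chromosome of the second breakpoint
        "pos2": int(info["END"]),  # position of the second breakpoint
        "SVtype": svtype,
        "strand1": strand1,
        "strand2": strand2,
    }


# Shortened version of the record shown above.
record = ("chr12 71315481 INV00010872 A <INV> . PASS "
          "PRECISE;SVTYPE=INV;CHR2=chr12;END=71316542;CT=5to5")
sv = vcf_line_to_sv(record)
print(sv)
```

Translocation records are not covered by this sketch, since the ShatterSeek format quoted in the question only lists the four intrachromosomal SV types.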


Thank you for pointing this out. But I have a different file format:

Chr Start   End Ref Alt GeneName    Func    Gene    GeneDetail  ExonicFunc  AAChange    Gencode cpgIslandExt    cytoBand    genomicSuperDups    Repeat  dgvMerged   RESOLUTION  CIEND   CIPOS   CT  INSLEN  PE  SR  SRQ MAPQ    GT  GL  GQ  FT  RCL CONSENSUS   CE  RC  RCR RDCN    DR  DV  RR  RV  SVID    TCHR    TSTART  SVType
chr18   15967778    18038438    0   0   .   intergenic  NR_027417,NONE  dist=641859;dist=NONE   .   .   .   .   18p11.1 .   Score=1138;Name="2099323:ALR/Alpha(Satellite)"  .   IMPRECISE   -89,89  -89,89  3to5    0   3   0   0   22  0/1 -2.68602,0,-40.5925 27  PASS    151664  na  na  28513   14563   0   11  3   0   0   DEL00039120 na  na  DEL

As you can see, I can only find TCHR and TSTART columns, and both values are na. Additionally, there is no TEND column.
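One possible reading of this table, sketched below in Python, is that an na in TCHR simply means the second breakpoint lies on the same chromosome, so that Chr/End can stand in for chrom2/pos2. That interpretation is an assumption, as is the CT-to-strand mapping, and the function name `table_to_svs` is hypothetical. The sample input keeps only the columns the sketch uses, with values copied from the row above.

```python
import csv
import io

# ASSUMPTION: DELLY CT tag -> (ShatterSeek SVtype, strand1, strand2).
CT_TO_SHATTERSEEK = {
    "3to5": ("DEL", "+", "-"),
    "5to3": ("DUP", "-", "+"),
    "3to3": ("h2hINV", "+", "+"),
    "5to5": ("t2tINV", "-", "-"),
}


def table_to_svs(tsv_text):
    """Hypothetical helper: convert annotated DELLY rows to ShatterSeek-style rows.

    ASSUMPTION: TCHR == "na" means both breakpoints are on Chr, at Start and End.
    Rows with a TCHR value are skipped here and would need separate handling.
    """
    svs = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        if row["TCHR"] != "na":
            continue
        svtype, strand1, strand2 = CT_TO_SHATTERSEEK[row["CT"]]
        svs.append({
            "chrom1": row["Chr"], "pos1": int(row["Start"]),
            "chrom2": row["Chr"], "pos2": int(row["End"]),
            "SVtype": svtype, "strand1": strand1, "strand2": strand2,
        })
    return svs


# Reduced copy of the row above (only the columns used by the sketch).
sample = (
    "Chr\tStart\tEnd\tCT\tSVID\tTCHR\tTSTART\tSVType\n"
    "chr18\t15967778\t18038438\t3to5\tDEL00039120\tna\tna\tDEL\n"
)
svs = table_to_svs(sample)
print(svs[0])
```

If this reading holds, the missing TEND column would not matter for DEL, DUP, and inversion calls, because End already carries the second position.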
