Normalize by counting CNV within a pool of VCF files
5.9 years ago
Tintest • 0

Hello,

I am trying to use the GATK4 germline CNV calling pipeline. I successfully obtained 57 VCFs from my sample batch, with calls reported as segments (obtained by merging contiguous intervals), as in a classic VCF:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  2046745451-1006_S4
M       3288    CNV_M_3288_15907        N       <DEL>,<DUP>     .       .       END=15907       GT:CN:NP:QA:QS:QSE:QSS  2:5:9:17:6:21:21
1       69071   CNV_1_69071_70028       N       <DEL>,<DUP>     .       .       END=70028       GT:CN:NP:QA:QS:QSE:QSS  1:0:1:204:204:204:204

However, I get far too many of these segments: more than 10k. Is there an existing tool that takes the segments in one VCF and, for each one, counts how many other samples in my batch have a segment overlapping it (treating two segments as the same when they share at least 75% of their length)? By finding the most recurrent segments, I could determine which ones are background noise and reduce the number of variants in my VCF by filtering them out.
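To make the idea concrete, here is a minimal Python sketch of the counting I have in mind. It is not an existing tool, and all function names (`parse_segments`, `reciprocal_overlap`, `recurrence_counts`) are hypothetical. It assumes that "common by 75%" means a reciprocal overlap of at least 75% of both segments, that each record carries `END=` in the INFO column as in the example above, and it ignores the GT and CN fields (so a deletion and a duplication at the same place count as the same segment). The comparison is brute force, so it is only meant to illustrate the logic.

```python
import re
from collections import defaultdict


def parse_segments(vcf_lines):
    """Return (chrom, start, end) for each record that has END= in INFO."""
    segments = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split()
        match = re.search(r"(?:^|;)END=(\d+)", fields[7])
        if match:
            segments.append((fields[0], int(fields[1]), int(match.group(1))))
    return segments


def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions of a and b (0.0 if disjoint)."""
    if a[0] != b[0]:
        return 0.0
    shared = min(a[2], b[2]) - max(a[1], b[1]) + 1
    if shared <= 0:
        return 0.0
    return min(shared / (a[2] - a[1] + 1), shared / (b[2] - b[1] + 1))


def recurrence_counts(samples, min_overlap=0.75):
    """Map (sample, segment) to the number of OTHER samples with a match.

    `samples` is a dict: sample name -> list of (chrom, start, end).
    """
    by_chrom = defaultdict(list)
    for name, segs in samples.items():
        for seg in segs:
            by_chrom[seg[0]].append((name, seg))

    counts = {}
    for name, segs in samples.items():
        for seg in segs:
            # Collect the distinct other samples that have a matching segment.
            others = {
                other
                for other, cand in by_chrom[seg[0]]
                if other != name
                and reciprocal_overlap(seg, cand) >= min_overlap
            }
            counts[(name, seg)] = len(others)
    return counts
```

With one list of segments per sample (for example, `parse_segments(open(path))` for each of the 57 VCFs), segments whose count is close to the batch size would be the candidates for background noise.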

Thank you.

CNV GATK4 VCF • 960 views