Extract DP of heterozygotes from a VCF file
7.0 years ago
Ana ▴ 200

Hi everyone

I want to plot the read depth distribution for heterozygous genotypes only in my VCF file. Does anyone have any idea how to extract the DP values of heterozygotes only? Thanks

vcffile heterozygotes sample read depth • 2.1k views
7.0 years ago
mbk0asis ▴ 680

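Try the following, which prints the site-level DP from the INFO column of every record (it does not filter by genotype):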

grep -Po 'DP=.*' INPUT | cut -d";" -f1

INPUT

chrX    129 .   G   GT,GGT  27173.73    .   AC=1,1;AF=0.500,0.500;AN=2;BaseQRankSum=2.721;ClippingRankSum=-0.400;DP=2555;ExcessHet=3.0103;FS=0.000;MLEAC=1,1;MLEAF=0.500,0.500;MQ=40.46;MQRankSum=-3.139;QD=27.54;ReadPosRankSum=-1.613;SOR=2.163   GT:AD:DP:GQ:PL  1/2:20,98,322:440:99:27211,8108,6549,2362,0,549
chrX    225 .   T   TC,TTGGGC   2133.73 .   AC=1,1;AF=0.500,0.500;AN=2;DP=930;ExcessHet=3.0103;FS=0.000;MLEAC=1,1;MLEAF=0.500,0.500;MQ=40.47;QD=29.39;SOR=2.124 GT:AD:DP:GQ:PL  1/2:0,23,9:32:99:2171,378,207,767,0,744

OUTPUT

DP=2555
DP=930
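
The DP values printed above are taken from the INFO column, so they are totals per site rather than per-sample depths for heterozygous calls. A minimal Python sketch for the per-sample case follows. It assumes a tab-delimited, uncompressed VCF whose FORMAT column lists both GT and DP, and it treats a genotype as heterozygous when all alleles are called and at least two differ (so 1/2 counts). The function name `het_depths` is made up for this illustration.

```python
import re


def het_depths(lines):
    """Yield (chrom, pos, sample, dp) for each heterozygous genotype call.

    Assumes tab-delimited VCF lines; calls with a missing DP are skipped.
    """
    samples = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # blank line or meta-information header
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        if len(fields) < 10:
            continue  # record without genotype columns
        keys = fields[8].split(":")
        if "GT" not in keys or "DP" not in keys:
            continue
        gt_i, dp_i = keys.index("GT"), keys.index("DP")
        for n, call in enumerate(fields[9:]):
            values = call.split(":")
            alleles = re.split(r"[/|]", values[gt_i])
            # Heterozygous here: no missing allele and at least two distinct ones.
            if "." in alleles or len(set(alleles)) < 2:
                continue
            # Trailing FORMAT fields may be dropped, and DP may be ".".
            if dp_i >= len(values) or not values[dp_i].isdigit():
                continue
            name = samples[n] if n < len(samples) else f"sample{n + 1}"
            yield fields[0], int(fields[1]), name, int(values[dp_i])
```

Passing an open file handle, for example `het_depths(open("in.vcf"))`, yields one tuple per heterozygous call; for the first INPUT record above, the per-sample depth would be 440 rather than the INFO value 2555.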
7.0 years ago

Using bioalcidae: https://github.com/lindenb/jvarkit/wiki/BioAlcidae

$ java -jar dist/bioalcidae.jar -e 'while(iter.hasNext()) {
    var vc = iter.next();
    for(var i = 0; i < vc.getNSamples(); ++i) {
        var g = vc.getGenotype(i);
        // keep heterozygous genotypes only
        if(!g.isHet()) continue;
        out.println(vc.getContig() + " " + vc.getStart() + " " + g.getSampleName() + " " + g.getDP() + " " + g.getAlleles());
    }
}' in.vcf



1 1149835 Sample1 1042 [A*, G]
1 1149973 Sample1 531 [C*, A]
1 1152069 Sample1 31 [A*, ATGAGACCGCACCAGCGTGTC]
1 1152369 Sample1 272 [G*, A]
1 1152369 Sample2 65 [G*, A]
1 1152431 Sample1 298 [A*, G]
1 1152689 Sample2 16 [C*, A]
1 1152689 Sample3 9 [C*, A]
1 1152689 Sample4 11 [C*, A]
1 1152689 Sample5 [C*, A]
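
To get from this table to the distribution asked about in the question, the depth column can be binned. The Python sketch below assumes the whitespace-separated layout shown above (contig, position, sample, DP, alleles) and skips rows that lack a numeric depth, such as the Sample5 row. The helper names and the bin width of 100 are arbitrary choices for this illustration.

```python
from collections import Counter


def parse_depths(lines):
    """Collect the DP column (4th field) from rows that carry a numeric depth."""
    depths = []
    for line in lines:
        fields = line.split()
        if len(fields) > 3 and fields[3].isdigit():
            depths.append(int(fields[3]))
    return depths


def depth_histogram(depths, bin_width=100):
    """Count depths per bin; keys are the lower bound of each bin, in ascending order."""
    counts = Counter((d // bin_width) * bin_width for d in depths)
    return dict(sorted(counts.items()))


def render(hist, bin_width=100):
    """Draw the histogram as text, one line per non-empty bin."""
    return "\n".join(
        f"{start}-{start + bin_width - 1}\t{'#' * count}"
        for start, count in hist.items()
    )
```

The dictionary returned by `depth_histogram` can also be handed to a plotting library in place of `render` if a graphical histogram is preferred.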