Bi-allelic SNP filtering
4.3 years ago

Hi all, I am trying to extract bi-allelic SNPs from my VCF file. I have a combined VCF file for 8 samples with tetraploid genomes. To get bi-allelic markers I am using the following command:

bcftools view -m2 -M2 -v snps input.vcf.gz

After filtering, I still get some odd-looking marker entries, such as:

Chr01   431     .       CTTTTTTGGTGA    GTTTTTTGGTGA    40.56   AB=0.166667;ABP=43.5445;AC=3;AF=0.0833333 ;TYPE=snp;technology.ILLUMINA=1   GT:AD:AO:DP:QA:QR:RO    0/0/0/0:16,0:0:29:0:585:16      0/0/0/0:18,0:0:20:0:683:18      0/0/0/0:10,0:0:11:0:375:10      0/0/0/0:7,0:0:10:0:276:7        0/0/0/0:9,0:0:10:0:328:9        0/0/0/1:9,3:3:14:113:342:9      0/0/0/1:12,3:3:18:104:425:12    0/0/0/1:7,1:1:10:32:263:7       0/0/0/0:3,0:0:3:0:105:3

Chr01   1407    .       TTAC    TTAT    421.02  .       AB=0.0991561;ABP=664.531;AC=9;AF=0.25 TYPE=snp;technology.ILLUMINA=1 GT:AD:AO:DP:QA:QR:RO    0/0/0/1:59,5:5:68:158:2112:59   0/0/0/1:68,7:7:76:255:2543:68   0/0/0/1:24,7:7:36:263:901:24    0/0/0/1:74,4:4:92:135:2927:74   0/0/0/1:41,5:5:48:167:1569:41   0/0/0/1:34,10:10:47:349:1187:34 0/0/0/1:44,3:3:51:97:1634:44    0/0/0/1:16,2:2:22:74:568:16     0/0/0/1:27,4:4:34:156:1028:27

Chr01   3006    .       CATTTTTTCCA     CATTTTTGCCA     20.57   .  AB=0.115385;ABP=69.8248;AC=4;AF=0.111111 TYPE=snp;technology.ILLUMINA=1    GT:AD:AO:DP:QA:QR:RO    0/0/0/1:16,3:3:22:123:593:16    0/0/0/0:9,0:0:10:0:329:9        0/0/0/1:18,1:1:20:41:670:18     0/0/0/1:2,1:1:5:41:79:2 0/0/0/0:12,0:0:14:0:454:12      0/0/0/1:4,1:1:5:37:149:4        0/0/0/0:17,0:0:20:0:643:17      0/0/0/0:11,1:1:12:11:399:11     0/0/0/0:8,0:0:12:0:286:8

Chr01   7324    .       TTTT    TTTC    2751.9  .       AB=0.392027;ABP=33.4903;AC=15;AF=0.416667;TYPE=snp;technology.ILLUMINA=1    GT:AD:AO:DP:QA:QR:RO    0/0/0/1:24,6:6:31:188:893:24    0/0/0/1:18,6:6:26:245:632:18    0/0/1/1:8,5:5:14:143:292:8      0/0/1/1:26,36:36:69:1358:959:26 0/0/0/1:20,12:12:34:395:738:20  0/0/1/1:13,10:10:23:377:485:13  0/0/1/1:14,10:10:26:365:525:14  0/0/1/1:24,15:15:41:529:872:24  0/0/1/1:16,18:18:37:676:590:16

These variants have higher quality scores than the simple A/T and C/G variants. My question is: are the variants above actually bi-allelic SNPs, and does bcftools treat entries like these as bi-allelic when filtering?
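To make the question concrete, here is a minimal Python sketch that counts, for each of the four records above, how many ALT alleles are listed and at which offsets REF and ALT differ. The helper `describe_alleles` is a hypothetical name of my own, not part of bcftools, and the sketch does not claim to reproduce how bcftools classifies variants internally. It only inspects the REF and ALT strings.

```python
# Hypothetical helper (not a bcftools function): summarise REF/ALT of one record.
def describe_alleles(ref, alt_field):
    """Return (number of ALT alleles, [(alt, differing offsets or None), ...])."""
    alts = alt_field.split(",")  # multi-allelic sites list ALTs comma-separated
    summary = []
    for alt in alts:
        if len(alt) == len(ref):
            # Same length: list the 0-based offsets where the bases differ.
            diffs = [i for i, (r, a) in enumerate(zip(ref, alt)) if r != a]
        else:
            # Different length: not a pure substitution, so no offsets reported.
            diffs = None
        summary.append((alt, diffs))
    return len(alts), summary


# CHROM, POS, REF, ALT copied from the records shown above.
records = [
    ("Chr01", 431, "CTTTTTTGGTGA", "GTTTTTTGGTGA"),
    ("Chr01", 1407, "TTAC", "TTAT"),
    ("Chr01", 3006, "CATTTTTTCCA", "CATTTTTGCCA"),
    ("Chr01", 7324, "TTTT", "TTTC"),
]

for chrom, pos, ref, alt in records:
    n_alt, summary = describe_alleles(ref, alt)
    for allele, diffs in summary:
        print(chrom, pos, "ALT alleles:", n_alt, "differing offsets:", diffs)
```

For these four records the sketch reports one ALT allele and exactly one differing base each, so on the face of it they look like single-base substitutions written with flanking bases rather than multi-allelic sites.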

bcftools bi-allelic snps tetraploid genome • 1.3k views