Question: VCF file lacks alternative allele listing for most rs ids
4.9 years ago by
devenvyas570
Stony Brook
devenvyas570 wrote:

I have SNP data on 64 samples from my population of interest (~330,000 SNPs per sample using the HumanCNV370-Quad).

I sorted and filtered the published Altai Neanderthal and Denisovan VCF files (http://cdna.eva.mpg.de/neandertal/altai/AltaiNeandertal/VCF/ and http://cdna.eva.mpg.de/denisova/VCF/hg19_1000g/) down to only the rs#s found on my SNP data.
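The filtering step described above can be sketched roughly as follows. This is only a minimal illustration, not the exact procedure used on the published files: the function name `filter_vcf_by_rsids` and the toy records are hypothetical, and it assumes an uncompressed, tab-delimited VCF with rs IDs in the third (ID) column.

```python
def filter_vcf_by_rsids(vcf_lines, wanted_ids):
    """Yield header lines plus records whose ID column contains a wanted rs ID."""
    for line in vcf_lines:
        if line.startswith("#"):
            # Keep meta-information and the column header untouched.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # The ID column may hold several IDs separated by ';'.
        record_ids = fields[2].split(";")
        if any(rsid in wanted_ids for rsid in record_ids):
            yield line


# Hypothetical toy input; real files would be read line by line from disk.
example_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
    "1\t100\trs1\tA\t.\t50\t.\t.\tGT\t0/0",
    "1\t200\trs2\tC\tT\t50\t.\t.\tGT\t0/1",
    "1\t300\trs3;rs9\tG\t.\t50\t.\t.\tGT\t0/0",
]
array_ids = {"rs1", "rs3"}  # stand-in for the rs IDs on the SNP array
kept = list(filter_vcf_by_rsids(example_vcf, array_ids))
```

Holding the array's rs IDs in a set keeps each lookup constant-time, which matters when scanning whole-genome VCFs.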

I then noticed a problem: for well over half of the SNPs in the Neanderthal VCF, and for a small percentage in the Denisovan VCF, the alternative allele is not listed. When I look up those SNPs in dbSNP or in the Denisovan VCF file, alternate alleles exist and are listed. Luckily, it seems that whenever the alt allele is not listed, the genotype is always homozygous for the ref allele.
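One way to check that observation systematically is sketched below. It counts records whose ALT column is `.` (the VCF missing value) and reports any whose genotype is not homozygous reference. The function name `summarize_missing_alt` and the toy records are hypothetical, and the sketch assumes tab-delimited records with a FORMAT column that includes `GT`.

```python
def summarize_missing_alt(vcf_lines):
    """Return (total records, records with ALT '.', IDs of missing-ALT records
    where some sample is not homozygous reference)."""
    total = 0
    missing = 0
    not_homref = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        total += 1
        if fields[4] != ".":
            continue  # an ALT allele is listed, nothing to check
        missing += 1
        if len(fields) < 10:
            continue  # no genotype columns present
        format_keys = fields[8].split(":")
        if "GT" not in format_keys:
            continue
        gt_index = format_keys.index("GT")
        for sample in fields[9:]:
            genotype = sample.split(":")[gt_index]
            alleles = genotype.replace("|", "/").split("/")
            # Anything other than allele index 0 (including '.') is flagged.
            if any(allele != "0" for allele in alleles):
                not_homref.append(fields[2])
                break
    return total, missing, not_homref


# Hypothetical toy records for illustration only.
example_records = [
    "1\t100\trs1\tA\t.\t50\t.\t.\tGT\t0/0",
    "1\t200\trs2\tC\tT\t50\t.\t.\tGT\t0/1",
    "1\t300\trs3\tG\t.\t50\t.\t.\tGT:DP\t0/0:12",
]
summary = summarize_missing_alt(example_records)
```

An empty third element in the result would be consistent with the pattern described above, namely that sites without a listed alt allele are all homozygous reference calls.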

Since these are ancient DNA calls, I will have to filter out some types of substitutions, but I can't do that if the alt allele is not listed. How do I fix this?
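To illustrate why a missing alt allele blocks this kind of filtering, here is a small sketch that classifies a site by its REF/ALT pair. Which substitution types need to be removed is not specified here; the transition/transversion split is only an assumed example, and `classify_substitution` is a hypothetical helper.

```python
# Purine<->purine and pyrimidine<->pyrimidine changes are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
BASES = {"A", "C", "G", "T"}


def classify_substitution(ref, alt):
    """Classify a biallelic SNP as 'transition', 'transversion',
    'unknown' (no ALT listed), or 'other' (not a simple SNP)."""
    ref, alt = ref.upper(), alt.upper()
    if alt == ".":
        # Without an ALT allele the substitution type cannot be determined.
        return "unknown"
    if ref not in BASES or alt not in BASES or ref == alt:
        return "other"
    if (ref, alt) in TRANSITIONS:
        return "transition"
    return "transversion"


labels = [classify_substitution(r, a) for r, a in [("C", "T"), ("A", "C"), ("G", ".")]]
```

Sites labelled `unknown` are exactly the ones affected by the missing alt allele: they would need the alt allele supplied from another source before any substitution-based filter could be applied.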

Also, I plan on using vcf-isec to intersect the two files. How will the incongruous alt allele information affect this? Thanks!

-Deven

snp vcftools dbsnp vcf • 1.8k views

Can anyone assist with this? Why does this happen? I have ~6K SNPs with no alternate allele(s) on chr22 in one of the call sets I generated. GRCh37, Homo sapiens, Illumina HiSeq, 150 bp.

written 7 weeks ago by Bioinformatics_NewComer320