Question: Renaming SNPs or SNP matching
Ryan D (USA) wrote 5.0 years ago:

This should be easy to do by now, but... we have SNP data from an Illumina exome array given to us in PLINK format. The BIM file looks like this:

1       exm2253575      0       881627  G       A
1       exm269  0       881918  A       G
1       exm340  0       888659  T       C
1       exm348  0       889238  A       G
1       exm2264981      0       894573  G       A
1       exm773  0       909238  G       C
1       exm782  0       909309  C       T
1       exm912  0       949608  A       G
1       exm991  0       977028  T       G
1       exm1024 0       978762  A       G

And I have all of the SNPs in dbSNP 138 downloaded as a large VCF file:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO
1       10019   rs376643643     TA      T       .       .       RS=376643643;RSPOS=10020;dbSNPBuildID=138;SSR=0;SAO=0;VP=0x050000020001000002000200;WGT=1;VC=DIV;R5;OTHERKG
1       10054   rs373328635     CAA     C,CA    .       .       RS=373328635;RSPOS=10055;dbSNPBuildID=138;SSR=0;SAO=0;VP=0x050000020001000002000210;WGT=1;VC=DIV;R5;OTHERKG;NOC
1       10109   rs376007522     A       T       .       .       RS=376007522;RSPOS=10109;dbSNPBuildID=138;SSR=0;SAO=0;VP=0x050000020001000002000100;WGT=1;VC=SNV;R5;OTHERKG
1       10139   rs368469931     A       T       .       .       RS=368469931;RSPOS=10139;dbSNPBuildID=138;SSR=0;SAO=0;VP=0x050000020001000002000100;WGT=1;VC=SNV;R5;OTHERKG
1       10144   rs144773400     TA      T       .       .       RS=144773400;RSPOS=10145;dbSNPBuildID=134;SSR=0;SAO=0;VP=0x050000020001000002000200;WGT=1;VC=DIV;R5;OTHERKG
1       10146   rs375931351     AC      A       .       .       RS=375931351;RSPOS=10147;dbSNPBuildID=138;SSR=0;SAO=0;VP=0x050000020001000002000200;WGT=1;VC=DIV;R5;OTHERKG

I want to match them up so that each SNP in the BIM file is identified in the VCF file, mostly so I can rename them with proper dbSNP rsIDs. I have been trying to match them by formatting both as BED files and using BEDTools, restricting the dbSNP side to SNVs. The problem is that some SNPs share the same chromosome and start position. Is there an easy way to rename or identify the SNPs that also takes allele information into account, using VCFtools, BEDTools, PLINK, or another common tool? I get matches for about 99% using BEDTools and command-line options, but there must be an easier or standard way to get this right.

Thanks,

Ryan
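
For readers with the same problem, here is a minimal Python sketch of the allele-aware matching described above. It is not a standard tool, only one way to do it: index the dbSNP SNVs by chromosome, position, and the unordered allele pair, then look up each BIM row, also trying the strand-flipped alleles because array data may be reported on the opposite strand. The function names, the toy records, and the rsDemo/exm IDs are all made up for illustration, and the sketch assumes both files use the same genome build and chromosome naming. Palindromic pairs (A/T, C/G) cannot be strand-resolved this way.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def allele_keys(a1, a2):
    """Return the allele pair and its strand-flipped pair as order-free keys."""
    direct = frozenset((a1, a2))
    flipped = frozenset((COMPLEMENT.get(a1, a1), COMPLEMENT.get(a2, a2)))
    return {direct, flipped}


def load_dbsnp_snvs(vcf_lines):
    """Index single-base REF/ALT records as (chrom, pos, {ref, alt}) -> rsID."""
    index = {}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, rsid, ref, alts = line.split(None, 5)[:5]
        for alt in alts.split(","):
            # Skip indels and other multi-base records.
            if len(ref) == 1 and len(alt) == 1:
                key = (chrom, int(pos), frozenset((ref, alt)))
                index.setdefault(key, rsid)  # keep the first rsID seen
    return index


def rename_bim(bim_lines, index):
    """Yield BIM rows, replacing the ID when exactly one rsID matches."""
    for line in bim_lines:
        if not line.strip():
            continue
        chrom, snp_id, cm, pos, a1, a2 = line.split()
        hits = set()
        for alleles in allele_keys(a1, a2):
            key = (chrom, int(pos), alleles)
            if key in index:
                hits.add(index[key])
        new_id = hits.pop() if len(hits) == 1 else snp_id
        yield "\t".join((chrom, new_id, cm, pos, a1, a2))


# Toy records (hypothetical, not real dbSNP entries).
DEMO_VCF = """#CHROM POS ID REF ALT QUAL FILTER INFO
1 100 rsDemo1 G A . . VC=SNV
1 100 rsDemo2 G C . . VC=SNV
1 200 rsDemo3 T C . . VC=SNV
1 300 rsDemo4 TA T . . VC=DIV
"""
DEMO_BIM = """1 exmA 0 100 G A
1 exmB 0 200 A G
1 exmC 0 300 T C
"""

index = load_dbsnp_snvs(DEMO_VCF.splitlines())
renamed = list(rename_bim(DEMO_BIM.splitlines(), index))
for row in renamed:
    print(row)
```

In the toy data, exmA shares its position with two dbSNP records and is resolved by its alleles, exmB matches only after strand flipping, and exmC keeps its original ID because the only record at that position is an indel. For a full dbSNP VCF the in-memory index would be large, so restricting the VCF to the array's positions first would be a sensible step.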

 

Tags: snp, bim, exome-chip, bedtools, vcf
jlwebb wrote 3.0 years ago:

Did you figure this out?

vakul.mohanty (United States) wrote 2.9 years ago:

It would be easier to use an annotator like ANNOVAR.
