Question: What reference build do Affymetrix 5.0 arrays use?
2.1 years ago by
liangx wrote:

Hello, I have some PLINK data from Affymetrix 5.0 arrays and I am converting them to VCFs for imputation. However, I am running into a problem where my VCF files do not match my reference files. The references I have are the GRCh37 FASTA and the GRCh38 primary assembly. Is there a way for me to find out the reference build? I am getting a 75.5% mismatch when I run the bcftools fixref plugin.

modified 12 months ago by freeseek • written 2.1 years ago by liangx

Can you look at some of the sites where there is mismatch to see what may be happening? Also, which array annotation files did you download?
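One way to inspect the mismatching sites is to classify each one by how its alleles relate to the reference base at that position. The sketch below is only an illustration and is not what the bcftools fixref plugin does internally. `classify_site` and `summarize` are hypothetical helper names, and the code assumes biallelic SNPs whose reference base has already been looked up in the FASTA.

```python
from collections import Counter

# Watson-Crick complements, used to test for strand flips.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def classify_site(ref_base, allele1, allele2):
    """Classify a biallelic SNP against the reference base at its position.

    allele1 is the allele currently written as REF, allele2 the one written
    as ALT. Palindromic SNPs (A/T, C/G) cannot be told apart by strand from
    the alleles alone, so they are reported as "match" or "swapped" here.
    """
    ref_base = ref_base.upper()
    a1, a2 = allele1.upper(), allele2.upper()
    if a1 == ref_base:
        return "match"
    if a2 == ref_base:
        return "swapped"
    if COMPLEMENT.get(a1) == ref_base:
        return "strand_flip"
    if COMPLEMENT.get(a2) == ref_base:
        return "strand_flip_and_swapped"
    # Reached for N in the reference or for palindromic SNPs whose
    # alleles do not include the reference base.
    return "unresolved"


def summarize(sites):
    """Count categories over an iterable of (ref_base, allele1, allele2)."""
    return Counter(classify_site(*site) for site in sites)
```

As a rough guide for non-palindromic SNPs, counts dominated by `swapped` or `strand_flip` would point to an allele-coding or strand issue, whereas a near-even spread across all four categories would be more consistent with positions that do not belong to the reference being tested.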

Array probe sequences should be mostly independent of the genome build, as they refer only to sequence; however, the annotated genomic positions of these probes change from one build to the next.
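Because only the annotated positions differ between builds, one way to guess the build of a PLINK dataset is to compare the marker positions in its .bim file with the positions listed in the annotation for each candidate build. The following is a minimal sketch under that assumption. The function names, the build labels, and the in-memory dictionaries are hypothetical; loading the real annotation files into `{marker_id: (chrom, pos)}` dictionaries is left out.

```python
def parse_bim(lines):
    """Parse PLINK .bim lines into {marker_id: (chrom, pos)}.

    .bim columns: chromosome, variant ID, genetic distance,
    base-pair position, allele 1, allele 2.
    """
    positions = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, marker_id, _cm, pos = fields[:4]
        positions[marker_id] = (chrom, int(pos))
    return positions


def position_concordance(bim_positions, annotation_positions):
    """Fraction of shared markers with identical (chrom, pos)."""
    shared = bim_positions.keys() & annotation_positions.keys()
    if not shared:
        return 0.0
    same = sum(
        1 for marker in shared
        if bim_positions[marker] == annotation_positions[marker]
    )
    return same / len(shared)


def guess_build(bim_positions, annotations_by_build):
    """Return (best_label, scores) across the candidate annotations."""
    scores = {
        label: position_concordance(bim_positions, annotation)
        for label, annotation in annotations_by_build.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy data with made-up marker IDs and positions, for illustration only.
bim = parse_bim([
    "1 marker1 0 1000 A G",
    "1 marker2 0 2000 C T",
    "2 marker3 0 3000 G A",
])
annotations = {
    "buildA": {"marker1": ("1", 1000), "marker2": ("1", 2000), "marker3": ("2", 3000)},
    "buildB": {"marker1": ("1", 1000), "marker2": ("1", 2450), "marker3": ("2", 3900)},
}
best, scores = guess_build(bim, annotations)
```

A concordance close to 1.0 for one annotation and clearly lower for the others would suggest which build the positions came from. This check relies on marker IDs and positions only, so it is not affected by strand or allele-coding differences.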

Annotation files for SNP 5.0 are available here:

written 2.1 years ago by Kevin Blighe
12 months ago by
freeseek wrote:

I would advise using the affy2vcf bcftools plugin to convert the original raw CEL files to VCF, rather than converting PLINK data to VCF, which is not recommended. Note that affy2vcf also lets you remap the array manifest to GRCh38, or to another human genome reference of your choice, if you need your genotype data mapped against the current human genome reference.

written 12 months ago by freeseek

Instead of refreshing multiple old posts with essentially the same answer, I suggest you make a new Tool post to present your plugin. These kinds of posts typically include a short description of the scientific problem your tool can help answer, a description of its capabilities, and maybe a short code example. I am reasonably sure this will promote your tool better than adding it to old posts with accepted answers.

Edit: You have now refreshed 10 old posts with links to the same repository over the last few days. Please stop doing that. Do yourself a favor and make a Tool post, please.

modified 12 months ago • written 12 months ago by ATpoint

Thank you for your suggestion, ATpoint. The problem is that these posts, despite being old, come up in Google searches for the right keywords and often have no satisfactory answer (such as this one). I just did not want users to end up on these questions and find no solution. I am just trying to help. Maybe the best solution would be to have these old posts deleted.

written 12 months ago by freeseek

If you create a comprehensive Tool post for your software, with the right keywords covering possible use cases, it should start showing up in Google searches as well.

Since some of these old threads were created by users who may no longer visit Biostars, we may never know whether the solution you proposed works.

modified 12 months ago • written 12 months ago by GenoMax

Agreed, and feel free to delete these old posts if you think it is impossible to get feedback from the right users. As soon as I get a bit of feedback confirming that the tool satisfies most use cases, I will post it as a Tool. Thank you, ATpoint and GenoMax, for your suggestions! :-)

written 12 months ago by freeseek