Question: how to find strand information for a list of SNPs?
2.9 years ago by
United States
miaowzai230 wrote:

I have a list of SNPs from a small GWAS dataset, but I don't have their strand information. What I have is the chromosome number, position, and alleles. I also have their rs numbers.

In order to do imputation, I need the strand information for each SNP, i.e. which strand (+/-) the recorded alleles are reported on. I found a webpage that has strand information for some common chips, but unfortunately I don't know which chip was used to generate the results.

I could eventually download the reference genome and write a script to do that. But before doing so, I was wondering whether there is a tool already available for this purpose. Thanks!

snp microarray imputation strand • 1.7k views

I don't understand why you need the strand information, or even care what the base is. Why not just pretend they are all on the plus strand? It won't affect positional association.

Reply written 2.9 years ago by Brian Bushnell (17k)

I don't know every detail of the imputation algorithm, but I think in order to impute more variants, the imputation tool needs to know the exact haplotype of each sample so that it can compare the sample haplotype with the reference genome and calculate the probabilities of a SNP in the unsequenced region. I guess that's why both imputation tools IMPUTE2 and SHAPEIT ask for strand information as input. I could be wrong. Do you have any idea? Thanks.

Reply modified 2.9 years ago • written 2.9 years ago by miaowzai (230)

Downloading the reference genome .fasta and checking each SNP's alleles against the reference base will certainly work, and can be done quickly. Note that C/G and A/T SNPs may be ambiguous, because their alleles read the same on both strands. Checking your alleles against dbSNP may also be an option (they maintain a VCF). I'm not aware of a batch lookup tool that would do that for you without at least some scripting, and would be interested to know of one. Also see the thread "assign each SNP a strand information" for a samtools example of reference base lookup.
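
The reference-base check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the reference is a small in-memory dict rather than a real .fasta, positions are 1-based, and the names `infer_strand`, `strand_for_snp`, and the example contig are hypothetical. It also assumes one of the two alleles matches the reference base, which does not hold for every SNP.

```python
# Complement table used to "flip" alleles to the opposite strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def infer_strand(ref_base, allele1, allele2):
    """Return '+', '-', 'ambiguous' or 'unknown' for one biallelic SNP."""
    ref = ref_base.upper()
    alleles = {allele1.upper(), allele2.upper()}
    # Anything that is not a plain A/C/G/T allele (indels, N, ...) is skipped.
    if not alleles <= COMPLEMENT.keys():
        return "unknown"
    flipped = {COMPLEMENT[a] for a in alleles}
    # A/T and C/G SNPs look identical on both strands, so the
    # reference base alone cannot tell the strands apart.
    if alleles == flipped:
        return "ambiguous"
    if ref in alleles:
        return "+"
    if ref in flipped:
        return "-"
    # Neither allele (nor its complement) matches the reference base.
    return "unknown"


def strand_for_snp(reference, chrom, pos, allele1, allele2):
    """Look up the reference base (1-based position) and infer the strand."""
    ref_base = reference[chrom][pos - 1]
    return infer_strand(ref_base, allele1, allele2)


# Toy reference standing in for a real genome (assumption for illustration).
reference = {"chrToy": "ACGTAC"}

# (chromosome, position, allele1, allele2); values are made up.
snps = [
    ("chrToy", 1, "A", "G"),  # reference base A is one of the alleles
    ("chrToy", 1, "T", "C"),  # alleles match only after complementing
    ("chrToy", 2, "C", "G"),  # strand-ambiguous C/G SNP
]

for chrom, pos, a1, a2 in snps:
    print(chrom, pos, a1, a2, strand_for_snp(reference, chrom, pos, a1, a2))
```

For a real dataset, the dict lookup would be replaced by fetching bases from an indexed reference (for example via `samtools faidx`), and SNPs reported as ambiguous or unknown would need another source of evidence, such as allele frequencies or the dbSNP record.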

Reply modified 2.9 years ago • written 2.9 years ago by Ahill (1.8k)
2.9 years ago by
Emily_Ensembl21k wrote:

Usually, if people haven't specified the strand, they mean positive strand. Or, you could just ask the people who did the study.


Thanks! I checked a few SNPs and they are not all on the positive strand, so I believe strand information exists, but my data provider didn't give it to me. I will ask them.

Reply written 2.9 years ago by miaowzai (230)