How to find strand information for a list of SNPs?
6.7 years ago
miaowzai ▴ 390

I have a list of SNPs from a small GWAS dataset, but I don't have their strand information. What I have is the chromosome number, position, and alleles. I also have their rs numbers.

In order to do imputation, I need the strand information for each SNP, i.e. which strand (+/-) the recorded alleles were reported on. I found a webpage (http://www.well.ox.ac.uk/%7Ewrayner/strand/) that has strand information for some common chips, but unfortunately I don't know which chip was used to generate the results.

I could eventually download the reference genome and write a script to do that. But before that, I was wondering if there is any tool already available for this purpose. Thanks!

strand microarray SNP imputation

I don't understand why you need the strand information, or even care what the base is. Why not just pretend they are all on the plus strand? It won't affect positional association.


I don't know every detail of the imputation algorithm, but I think in order to impute more variants, the imputation tool needs to know the exact haplotype of each sample so that it can compare the sample haplotype with the reference genome and calculate the probabilities of a SNP in the unsequenced region. I guess that's why both imputation tools IMPUTE2 and SHAPEIT ask for strand information as input. I could be wrong. Do you have any idea? Thanks.


Downloading the reference genome FASTA and checking the reference base at each position will certainly work, and can be done quickly. C/G and A/T SNPs may be ambiguous, since their two alleles are complements of each other and look the same on either strand. Checking your alleles against dbSNP may also be an option (they maintain a VCF). I'm not aware of a batch lookup tool that would do this for you without at least some scripting, and would be interested to know of one. Also see the thread "assign each SNP a strand information" for a samtools example of ref base lookup.

6.7 years ago
Emily 23k

Usually, if people haven't specified the strand, they mean positive strand. Or, you could just ask the people who did the study.


Thanks! I checked a few SNPs and they are not all on the positive strand, so I believe strand information exists, but my data provider didn't give it to me. I will ask them.
