- Do the paper authors tell you which strand they've quoted their alleles on?
- Look up the variants in public databases. Ensembl always quotes the forward strand alleles. If the paper talks about G/T and the database talks about C/A, then they've given the reverse strand alleles, so flip them. Unfortunately, if the paper talks about A/T and the database talks about T/A, the alleles are each other's complement, so you can't always tell if they've used the reverse strand or just swapped reference and alternative. The same applies to G/C.
- Look at the gene the paper discussed with reference to the variant. If they are talking about a reverse strand gene, they'll usually be talking about reverse strand variant alleles.
- If it's still not clear, email the paper authors to ask.
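
The database comparison above can be sketched in a few lines of Python. This is only an illustration for biallelic SNPs: the function name `classify_alleles` and its return labels are made up for this example and are not part of any Ensembl or GWAS Catalog tool.

```python
# Complement of each base, used to flip alleles to the other strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def classify_alleles(paper, database):
    """Compare two allele strings such as 'G/T' (hypothetical helper).

    Allele order is ignored, because a paper may swap reference and
    alternative without changing strand.
    """
    paper_set = set(paper.upper().split("/"))
    db_set = set(database.upper().split("/"))
    flipped_set = {COMPLEMENT[a] for a in paper_set}

    same = paper_set == db_set
    flipped = flipped_set == db_set

    if same and flipped:
        # A/T or G/C: the complement gives the same pair, so the
        # strand cannot be inferred from the alleles alone.
        return "ambiguous"
    if same:
        return "same_strand"
    if flipped:
        return "reverse_strand"
    return "mismatch"


print(classify_alleles("G/T", "C/A"))  # reverse_strand
print(classify_alleles("A/T", "T/A"))  # ambiguous
```

An "ambiguous" result is the case where you need the other checks: the strand of the gene, or an email to the authors.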
This is not an easy problem, because we're talking about human behaviour rather than proper data conventions. This is why the GWAS Catalog has a policy of quoting exactly the alleles given in the paper rather than trying to convert them, and we at Ensembl do the same when we import their data. If we claimed to convert, we would be expected to always convert correctly, whereas if we don't convert, we aren't responsible for misinterpreting a G/C or A/T.