Get variant position knowing the gene name, base change, and NM#
3.3 years ago
bisansamara ▴ 10

I have a list of more than 45 variants. I need to get the exact genomic position of each variant (i.e. chr#:start-end), knowing the affected gene name, the base change, and the RefSeq ID (NM#). Below are a few lines as an example of my data:

Gene      BaseChange    refseqID
MAN1B1    c.1897G>T     NM_016219
CRY1      c.272G>A      NM_004075

Some people suggested using MutationTaster to do this, but I couldn't figure out how to use it. Any suggestions on how to use MutationTaster, or any other method to get the variant positions, would be highly appreciated.

gene variant genome coordinates • 906 views
3.3 years ago

Use transvar. Convert your first and second columns to something like MAN1B1:c.1897G>T and run transvar on that.
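The column conversion can be sketched in shell as below. This is only a sketch under assumptions: the file name `variants.txt` and the helper name `make_ids` are made up for illustration, and the input is assumed to be whitespace-separated with one header line. It also tolerates stray spaces around `>`, in case the source list has them.

```shell
# Sample input in the same layout as the question (header line + one variant per line).
# variants.txt is a hypothetical file name used only for this example.
printf 'Gene\tBaseChange\trefseqID\nMAN1B1\tc.1897G >T\tNM_016219\nCRY1\tc.272G >A\tNM_004075\n' > variants.txt

# make_ids (hypothetical helper): skip the header, remove spaces around ">",
# then join column 1 (gene) and column 2 (base change) with a colon.
make_ids() {
  tail -n +2 "$1" | awk '{ gsub(/ *> */, ">"); print $1 ":" $2 }'
}

make_ids variants.txt
```

Each resulting identifier (for example `MAN1B1:c.1897G>T`) should then be usable as a single transvar query, which as far as I know is passed with `transvar canno -i`. Check `transvar canno --help` on your installation before relying on that.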


This is super easy! Thanks.


Hello bisansamara,

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one if they work.

fin swimmer


If you can install transvar on your machine, it would be much easier:

$ tail -n+2 test.txt | transvar canno -l - -g 1 -m 2 --refseq | cut -f1,2,5 | awk '{gsub("[|/(]","\t")}1' | cut --complement -f3,5,7 | tail -n+2
MAN1B1  c.1897G>T   XM_006716945    chr9:g.137107663G>T     p.V633L
MAN1B1  c.1897G>T   NM_016219       chr9:g.137108388G>T     p.V633F
CRY1    c.272G>A    NM_004075       chr12:g.107005244C>T    p.W91*

input:

$ cat test.txt
Gene    BaseChange  refseqID
MAN1B1  c.1897G>T   NM_016219
CRY1    c.272G>A    NM_004075


If you want results only for the transcripts in your list:

$ tail -n+2 test.txt | transvar canno -l - -g 1 -m 2 -t3 --refseq | cut -f1,2,5 | awk '{gsub("[|/(]","\t")}1' | cut --complement -f3,5,7 | tail -n+2

MAN1B1  c.1897G>T   NM_016219   chr9:g.137108388G>T p.V633F
CRY1    c.272G>A    NM_004075   chr12:g.107005244C>T    p.W91*
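Since the question asks for positions as chr#:start-end, the genomic HGVS column in this output can be reshaped with a little awk. This is a rough sketch, not part of transvar: `to_region` is a hypothetical helper name, it assumes tab-separated lines with the `chr:g.` description in column 4, and it only handles single-base substitutions, where start and end are the same position. The coordinates are whatever reference build your transvar installation is configured with.

```shell
# to_region (hypothetical helper): read tab-separated lines whose 4th column is a
# genomic HGVS substitution such as chr9:g.137108388G>T, and print
# gene, base change, transcript, and chr:start-end (start == end for a SNV).
to_region() {
  awk -F'\t' -v OFS='\t' '{
    split($4, a, ":g.")         # a[1] = chromosome, a[2] = e.g. 137108388G>T
    pos = a[2]
    sub(/[^0-9].*$/, "", pos)   # keep only the leading digits (the position)
    print $1, $2, $3, a[1] ":" pos "-" pos
  }'
}

# Example using the two output lines shown above.
printf 'MAN1B1\tc.1897G>T\tNM_016219\tchr9:g.137108388G>T\tp.V633F\nCRY1\tc.272G>A\tNM_004075\tchr12:g.107005244C>T\tp.W91*\n' | to_region
```

For the two variants above this gives chr9:137108388-137108388 and chr12:107005244-107005244. Insertions, deletions, and other multi-base changes would need their own parsing.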