How to get all the versions of refSeq
7.5 years ago
talmud.k • 0

I have downloaded several databases to annotate my sequencing results. For some variants, the databases give only an HGVS description rather than genomic positions. When I tried to look up their positions in refGene (downloaded from UCSC), I found that the transcript IDs there carry no version numbers. A description in my database looks like "NM_144631.5:c.1015T>C", while the ID in refGene is just "NM_144631". I checked on the NCBI website, and this is indeed the latest version. However, I suspect that my databases may also contain older entries that correspond to earlier transcript versions. So how can I get all versions of the RefSeq data, including the positions, cDNA, and exon regions (as in refGene)?

gene • 3.3k views

I'm looking for the same thing; I only found the file for the current assembly. Does this help at all?

ftp://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/RefSeqGene/LRG_RefSeqGene

6.2 years ago
Reece ▴ 310

Consider the Python hgvs package. [Disclosure: I'm one of the authors.]

hgvs uses UTA, the Universal Transcript Archive, which is a collection of transcript versions from multiple sources with alignments to multiple references.
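Before looking anything up in a versioned transcript source, it can help to see which transcript versions the annotation databases actually reference. The following is a minimal sketch using only the Python standard library, not the hgvs package itself. The function names `split_hgvs` and `versions_by_accession` are hypothetical, the regular expression only covers simple RefSeq-style accessions such as "NM_144631.5", and every variant in the usage note except the one from the question is made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: accessions look like two capital letters, an underscore,
# digits, and an optional ".version" suffix (e.g. "NM_144631.5").
HGVS_RE = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+)(?:\.(?P<version>\d+))?:(?P<change>.+)$"
)


def split_hgvs(hgvs_string):
    """Split an HGVS string into (accession, version, change).

    The version is returned as an int, or None when the string
    carries no version suffix (as with the IDs in refGene).
    """
    match = HGVS_RE.match(hgvs_string.strip())
    if match is None:
        raise ValueError(f"not a recognised HGVS string: {hgvs_string!r}")
    version = match.group("version")
    return (
        match.group("accession"),
        int(version) if version is not None else None,
        match.group("change"),
    )


def versions_by_accession(hgvs_strings):
    """Collect the sorted set of versions seen for each accession."""
    seen = defaultdict(set)
    for hgvs_string in hgvs_strings:
        accession, version, _ = split_hgvs(hgvs_string)
        if version is not None:
            seen[accession].add(version)
    return {accession: sorted(versions) for accession, versions in seen.items()}
```

For example, `split_hgvs("NM_144631.5:c.1015T>C")` yields the unversioned accession "NM_144631" (the form used in refGene), the version 5, and the change "c.1015T>C". Running `versions_by_accession` over all HGVS strings in a database shows which accessions appear with more than one version, and those are the entries where the current refGene coordinates may not apply.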
