RefSeqGene Directly to GTF?
10.2 years ago
Gabe Rudy ▴ 320

UCSC says they take nightly dumps of the mRNA sequences from RefSeqGenes and BLAT them against their assemblies to generate the RefSeqGenes track.

NCBI does have a mappings file on their FTP site:

But this only provides a start/stop position along the assembly, not the CDS start/stop and exon boundaries.

Is there any way to take NCBI's FTP data and generate a proper GTF file, or is there another location on NCBI that has this data in one form or another?

9.8 years ago
Gabe Rudy ▴ 320

I finally got an answer from Deanna Church on this in response to a blog post we wrote about variant annotation.

Here are the GRCh37 GFF mappings of RefSeqGenes.

GFF3 can be a bit tricky to parse and get to the same fields as GTF, but all the data is there.
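
For anyone who wants GTF-style fields out of a GFF3 file, here is a rough Python sketch of one way to do it. It is not NCBI's or UCSC's method. It assumes that exon and CDS rows carry a `Parent` attribute pointing at a transcript `ID`, and that the transcript's own `Parent` is the gene, which is the usual GFF3 layout but may not hold for every record in the NCBI file. The function names `parse_attributes` and `gff3_to_gtf_lines` are made up for this example.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Split a GFF3 attribute column ("key=value;key=value") into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            # GFF3 percent-encodes reserved characters in attribute values.
            attrs[key.strip()] = unquote(value.strip())
    return attrs


def gff3_to_gtf_lines(gff3_lines):
    """Return GTF-style lines for the exon and CDS features in GFF3 input.

    Assumption: exon/CDS -> Parent = transcript ID, transcript -> Parent = gene ID.
    """
    records = []      # exon and CDS rows, kept in input order
    transcripts = {}  # transcript ID -> ID of its parent (assumed to be the gene)
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a feature row
        attrs = parse_attributes(cols[8])
        if cols[2] in ("exon", "CDS"):
            records.append((cols, attrs))
        elif "ID" in attrs and "Parent" in attrs:
            transcripts[attrs["ID"]] = attrs["Parent"]

    out = []
    for cols, attrs in records:
        # A GFF3 feature may list several parents separated by commas.
        for transcript_id in attrs.get("Parent", "").split(","):
            if not transcript_id:
                continue
            # Fall back to the transcript ID when no gene-level parent is known.
            gene_id = transcripts.get(transcript_id, transcript_id)
            group = 'gene_id "%s"; transcript_id "%s";' % (gene_id, transcript_id)
            out.append("\t".join(cols[:8] + [group]))
    return out
```

You would call it as `gff3_to_gtf_lines(open(path))` on a local, uncompressed copy of the file. The IDs that come out are whatever the GFF3 uses internally, so if you need gene symbols or accessions you would have to pull them from other attributes in the file.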

Note that microRNAs are in there as well.


