2 days ago by
UCSC Genome Browser
Short answer: http://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/genes/mm10.refGene.gtf.gz
Due to the way the Table Browser forms queries, its GTF output puts the same value in both the gene_id and transcript_id fields, like so:
chr1 mm9_refFlat stop_codon 3206103 3206105 0.000000 - . gene_id "Xkr4"; transcript_id "Xkr4";
This is why we label that output "GTF (limited)". We have a wiki page describing how to produce a proper GTF (http://genomewiki.ucsc.edu/index.php/Genes_in_gtf_or_gff_format), which comes down to using a separate utility for the conversion.
Another possible source of confusion is that you did not see the same refFlat table in the Table Browser for mm10. This is because for mm10/hg19/hg38, NCBI started releasing coordinates along with their annotation sequences. To get the equivalent of your selection for mm10, you would use the following:
Group: Gene and Gene prediction tracks
Track: NCBI RefSeq
Table: UCSC RefSeq (refGene)
Output format: GTF (limited)
Like refFlat, these are our own alignments of the NCBI sequences. However, because the output is limited, you will not get the gene name (which refFlat includes) unless you follow the conversion described on the wiki.
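If you want to check whether a given GTF line has the limited form described above (gene_id and transcript_id carrying the same value), here is a minimal Python sketch. It is only an illustration, not a UCSC tool: the function names are made up for this example, and it assumes tab-separated fields with attributes written as key "value"; pairs.

```python
import re

# Matches GTF attribute pairs such as: gene_id "Xkr4";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_attributes(attr_field):
    """Return the ninth GTF column as a dict of attribute name -> value."""
    return dict(ATTR_RE.findall(attr_field))


def is_limited_line(gtf_line):
    """True if the line has a gene_id that is identical to its transcript_id."""
    fields = gtf_line.rstrip("\n").split("\t")
    if len(fields) < 9:
        return False
    attrs = parse_attributes(fields[8])
    gene_id = attrs.get("gene_id")
    return gene_id is not None and gene_id == attrs.get("transcript_id")


# The Table Browser line quoted above, rebuilt with tab separators.
line = "\t".join([
    "chr1", "mm9_refFlat", "stop_codon", "3206103", "3206105",
    "0.000000", "-", ".", 'gene_id "Xkr4"; transcript_id "Xkr4";',
])
print(is_limited_line(line))
```

A file where every line passes this check was most likely produced with the "GTF (limited)" output format.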
We have also begun offering proper GTF files in our downloads directory. Here is the directory for mm10: http://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/genes/
The equivalent file for your case is http://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/genes/mm10.refGene.gtf.gz
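Once you have downloaded that file, a short Python sketch like the following could group transcript IDs by gene_id straight from the gzipped GTF. This is an illustrative example rather than anything we ship: the function name is hypothetical, and it assumes the usual nine tab-separated GTF columns with key "value"; attributes.

```python
import gzip
import re
from collections import defaultdict

# Matches GTF attribute pairs such as: gene_id "Xkr4";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def transcripts_by_gene(gtf_gz_path):
    """Map each gene_id to the set of transcript_id values seen with it."""
    genes = defaultdict(set)
    with gzip.open(gtf_gz_path, "rt") as handle:
        for line in handle:
            # Skip comment lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 9:
                continue
            attrs = dict(ATTR_RE.findall(fields[8]))
            if "gene_id" in attrs and "transcript_id" in attrs:
                genes[attrs["gene_id"]].add(attrs["transcript_id"])
    return genes
```

For example, transcripts_by_gene("mm10.refGene.gtf.gz") would read a local copy of the file above, assuming it was saved under that name in the current directory. If every gene maps only to a transcript with the same name as the gene, you are probably looking at limited output instead.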
If you have further questions, you can reach us at firstname.lastname@example.org. It may take us a little longer to answer questions on Biostars.