Question: How to download mm10 GTF file with the gene id and gene name using UCSC table browser?
1
gravatar for John
3 days ago by
John180
United States
John180 wrote:

Hi, what parameters should I use to download a GTF file for mm10 in the same format as the line below (the first line of my GTF file)?

chr1    unknown exon    3214482 3216968 .   -   .   gene_id "Xkr4"; gene_name "Xkr4"; p_id "P14345"; transcript_id "NM_001011874"; tss_id "TSS25485";

I can download this format for mm9 using the following parameters, but not for mm10:

Assembly: mm9
Group: Gene and Gene prediction tracks; 
Track: RefSeq genes; 
Table: refFlat
Output format: GTF

Thanks

ucsc rna-seq alignment • 200 views
Luis Nassar (UCSC Genome Browser) wrote, 2 days ago:

Hello,

Short answer: http://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/genes/mm10.refGene.gtf.gz

Long answer:

Due to the way the Table Browser forms queries, the Table Browser GTF output repeats the gene_id and transcript_id fields as such:

chr1    mm9_refFlat stop_codon  3206103 3206105 0.000000    -   .   gene_id "Xkr4"; transcript_id "Xkr4"; 

This is why we label that output "GTF (limited)". We have a wiki page describing how to do this properly (http://genomewiki.ucsc.edu/index.php/Genes_in_gtf_or_gff_format); it comes down to using a separate utility for the conversion. Another likely source of confusion is that the refFlat table you used for mm9 is not available in the Table Browser for mm10. This is because for mm10/hg19/hg38, NCBI started releasing coordinates along with their annotation sequences. This means that to get the equivalent of your selection for mm10, you would use the following:

Assembly: mm10
Group: Gene and Gene prediction tracks; 
Track: NCBI RefSeq; 
Table: UCSC RefSeq (refGene)
Output format: GTF (limited)

Like refFlat, these are our own alignments of the NCBI sequences. However, due to the limited output you will not have the gene name (included in refFlat) unless you follow the wiki conversion.
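If you are not sure which kind of file you already have, a quick check is to parse the attribute column and see whether gene_id and transcript_id carry the same value, as in the limited output shown above. This is only a minimal sketch: the helper names `parse_attributes` and `is_limited` are made up for illustration, and the two sample lines are rebuilt from this thread with tab separators assumed.

```python
import re

# Matches one GTF attribute of the form: key "value";
_ATTR = re.compile(r'(\S+)\s+"([^"]*)"\s*;')


def parse_attributes(column9):
    """Return the key/value pairs of a GTF attribute column as a dict."""
    return dict(_ATTR.findall(column9))


def is_limited(line):
    """Heuristic: True if gene_id and transcript_id hold the same value."""
    fields = line.rstrip("\n").split("\t")
    attrs = parse_attributes(fields[8])
    gene = attrs.get("gene_id")
    return gene is not None and gene == attrs.get("transcript_id")


# Sample lines taken from this thread (tab-separated columns assumed).
LIMITED_LINE = "\t".join([
    "chr1", "mm9_refFlat", "stop_codon", "3206103", "3206105",
    "0.000000", "-", ".",
    'gene_id "Xkr4"; transcript_id "Xkr4"; ',
])
FULL_LINE = "\t".join([
    "chr1", "unknown", "exon", "3214482", "3216968", ".", "-", ".",
    'gene_id "Xkr4"; gene_name "Xkr4"; p_id "P14345"; '
    'transcript_id "NM_001011874"; tss_id "TSS25485";',
])

print(is_limited(LIMITED_LINE))
print(is_limited(FULL_LINE))
```

The check looks at one line at a time, so in practice you would run it over the first few records of a file rather than trust a single line.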

We also have begun to offer these proper GTF files in our downloads directory. Here it is for mm10: http://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/genes/

The equivalent you will want to use will be http://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/genes/mm10.refGene.gtf.gz
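If you prefer to fetch and inspect the file from a script, something like the following should work. It is a rough sketch using only the Python standard library; the function names `download_gtf` and `head_gtf` and the local file name are assumptions for illustration, and only the URL comes from this answer.

```python
import gzip
import itertools
import urllib.request

# URL from the answer above; the default local file name is an arbitrary choice.
GTF_URL = ("http://hgdownload.soe.ucsc.edu/goldenPath/mm10/"
           "bigZips/genes/mm10.refGene.gtf.gz")


def download_gtf(url=GTF_URL, dest="mm10.refGene.gtf.gz"):
    """Fetch the gzipped GTF to a local file and return its path."""
    path, _headers = urllib.request.urlretrieve(url, dest)
    return path


def head_gtf(path, n=5):
    """Return the first n non-comment lines of a gzipped GTF file."""
    with gzip.open(path, "rt") as handle:
        records = (line.rstrip("\n") for line in handle
                   if not line.startswith("#"))
        return list(itertools.islice(records, n))
```

Calling `head_gtf(download_gtf())` downloads the file once and returns its first few records, so you can confirm that the attribute column has the fields you need before feeding it to a pipeline.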

If you have further questions, you can reach us at genome@soe.ucsc.edu. It may take us a little longer to answer questions on biostars.


Hi Luis, what about human? Can you share the GTF links for hg19 and hg38?

written 2 days ago by Shicheng Guo

Yes, we are still in the process of making them available for all of our assemblies.

hg38 GTFs: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/genes/

hg19 GTFs: http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/genes/

written 2 days ago by Luis Nassar
badribio wrote, 3 days ago:

Like this?


I can't see anything! thanks

written 3 days ago by John