If you would like to edit a GTF file obtained from the UCSC Table Browser, we provide some utilities for doing so:
http://genomewiki.ucsc.edu/index.php/Genes_in_gtf_or_gff_format
The gist is to download your table of interest, drop any leading columns that are not part of the genePred format (this may or may not be necessary depending on the table; for refGene, cut -f2- removes the first column, bin), then run the genePredToGtf utility:
$ mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -N -e "select * from refGene" hg19 | \
cut -f2- | genePredToGtf -source=hg19.refGene.ucsc file stdin stdout
To write the result to a file, replace stdout at the end of the last command with the output filename you want. The result is an hg19 refGene GTF file:
chr1 hg19.refGene.ucsc transcript 11869 14362 . + . gene_id "LOC102725121"; transcript_id "NR_148357"; gene_name "LOC102725121";
chr1 hg19.refGene.ucsc exon 11869 12227 . + . gene_id "LOC102725121"; transcript_id "NR_148357"; exon_number "1"; exon_id "NR_148357.1"; gene_name "LOC102725121";
chr1 hg19.refGene.ucsc exon 12613 12721 . + . gene_id "LOC102725121"; transcript_id "NR_148357"; exon_number "2"; exon_id "NR_148357.2"; gene_name "LOC102725121";
chr1 hg19.refGene.ucsc exon 13221 14362 . + . gene_id "LOC102725121"; transcript_id "NR_148357"; exon_number "3"; exon_id "NR_148357.3"; gene_name "LOC102725121";
chr1 hg19.refGene.ucsc transcript 11874 14409 . + . gene_id "DDX11L1"; transcript_id "NR_046018"; gene_name "DDX11L1";
...
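As a quick sanity check on a file produced this way, you can count the exon lines per transcript. This is only a minimal sketch: count_exons is a hypothetical helper name, not a UCSC utility, and it assumes the file is tab-separated with the attributes in column 9, as in the output above.

```shell
# Hypothetical helper (not a UCSC tool): count exon lines per transcript_id.
# Assumes a tab-separated GTF with attributes in column 9.
count_exons() {
  awk -F'\t' '$3 == "exon" {
    if (match($9, /transcript_id "[^"]+"/)) {
      # Skip the 15 characters of `transcript_id "` and drop the closing quote.
      id = substr($9, RSTART + 15, RLENGTH - 16)
      n[id]++
    }
  }
  END { for (id in n) print id "\t" n[id] }' "$1" | sort
}

# Usage (the filename is an assumption; use whatever you wrote in place of stdout):
#   count_exons hg19.refGene.gtf | head
```

Each output line is a transcript ID followed by its exon count, which should agree with the highest exon_number for that transcript.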
If you have further questions about UCSC data or tools, feel free to send them to one of the mailing lists below:
- General questions: genome@soe.ucsc.edu
- Questions involving private data: genome-www@soe.ucsc.edu
- Questions involving mirror sites: genome-mirror@soe.ucsc.edu
ChrisL from the UCSC Genome Browser