Question: Problem with locating GTF file
9 months ago by
elisheva70
Israel
elisheva70 wrote:

Hi everyone!
I have this expression file (UCSC genes, hg19):

tracking_id FPKM    FPKM    0   FPKM_status FPKM_status
DDX11L1 0.0220335   0.014392    0.0182127   OK  OK
WASH7P  0.325992    0.242878    0.284435    OK  OK
MIR6859-1   0   0   0   OK  OK
FAM138A 0.00576753  0.00565091  0.00570922  OK  OK
OR4F5   0   0   0   OK  OK
LOC729737   0.11037 0.134682    0.122526    OK  OK
LOC100132287    0   0   0   OK  OK
LOC100133331    0   0   0   OK  OK
OR4F29  0   0   0   OK  OK
MIR6723 0   0   0   OK  OK
OR4F29  0   0   0   OK  OK

I tried to get the sequences of all these genes, but couldn't find them.
I downloaded a GTF file of UCSC genes from the Table Browser.
But the gene names are different, and there are no "gene" features in the file at all.
It contains only CDS, exon, start_codon, and stop_codon features.
The GTF file I got looks like:

chr1    hg19_knownGene  exon    11874   12227   0.000000    +   .   gene_id "uc001aaa.3"; transcript_id "uc001aaa.3"; 
chr1    hg19_knownGene  exon    12613   12721   0.000000    +   .   gene_id "uc001aaa.3"; transcript_id "uc001aaa.3"; 
chr1    hg19_knownGene  exon    13221   14409   0.000000    +   .   gene_id "uc001aaa.3"; transcript_id "uc001aaa.3";

Does anybody know where I can find a proper file?

9 months ago by
toralmanvar530
toralmanvar530 wrote:

Hello,

Looking at the expression file, it seems that you obtained it from a reference-based transcriptome pipeline such as TopHat-Cufflinks. Your expression file also has gene symbols in the tracking_id column, which means you used a genome annotation file (.GTF) along with the reference genome FASTA file during mapping.

So why not use the same genome annotation file (.GTF) that was used during mapping to extract the gene coordinates (based on the tracking ID)? Those coordinates can then be used to fetch the gene sequences from the reference genome with bedtools.
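
A minimal Python sketch of the coordinate-extraction step, in case it helps. It collapses the exon records of a GTF into one span per gene and prints BED lines. The function names (gene_spans, to_bed) are made up for illustration, and the attribute that matches your tracking_id is an assumption: it may be gene_id or gene_name depending on the GTF you used for mapping, so check your file first. The demo input is the three exon lines from the question.

```python
import re

# Matches key "value" pairs in the GTF attribute column (column 9).
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def gene_spans(gtf_lines, key="gene_id"):
    """Collapse exon records into one (start, end) span per gene.

    `key` is the attribute assumed to hold the tracking ID; this is an
    assumption and may need to be "gene_name" for some GTF files.
    Spans are keyed by (name, chrom, strand) so that a name occurring on
    several chromosomes is not merged into one bogus interval.
    """
    spans = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        name = dict(ATTR_RE.findall(fields[8])).get(key)
        if name is None:
            continue
        k = (name, chrom, strand)
        if k in spans:
            old_start, old_end = spans[k]
            spans[k] = (min(old_start, start), max(old_end, end))
        else:
            spans[k] = (start, end)
    return spans


def to_bed(spans):
    """Format spans as BED6 lines (GTF is 1-based, BED start is 0-based)."""
    return [
        "\t".join([chrom, str(start - 1), str(end), name, "0", strand])
        for (name, chrom, strand), (start, end) in spans.items()
    ]


if __name__ == "__main__":
    demo = [
        'chr1\thg19_knownGene\texon\t11874\t12227\t0.000000\t+\t.\tgene_id "uc001aaa.3"; transcript_id "uc001aaa.3";',
        'chr1\thg19_knownGene\texon\t12613\t12721\t0.000000\t+\t.\tgene_id "uc001aaa.3"; transcript_id "uc001aaa.3";',
        'chr1\thg19_knownGene\texon\t13221\t14409\t0.000000\t+\t.\tgene_id "uc001aaa.3"; transcript_id "uc001aaa.3";',
    ]
    for bed_line in to_bed(gene_spans(demo)):
        print(bed_line)
```

The BED lines could then be saved to a file and passed to bedtools getfasta (with -fi for the genome FASTA, -bed for the BED file, and optionally -name and -s). Note that a span like this covers the whole locus, introns included, so it gives genomic rather than spliced transcript sequence.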

9 months ago by
genomax55k
United States
genomax55k wrote:

You can generate an annotation file in GTF format for the UCSC hg19 genome by following the directions in this post.
