Question: Problem with locating GTF file
14 months ago by
elisheva70
Israel
elisheva70 wrote:

Hi everyone!
I have this expression file (UCSC genes, hg19):

tracking_id FPKM    FPKM    0   FPKM_status FPKM_status
DDX11L1 0.0220335   0.014392    0.0182127   OK  OK
WASH7P  0.325992    0.242878    0.284435    OK  OK
MIR6859-1   0   0   0   OK  OK
FAM138A 0.00576753  0.00565091  0.00570922  OK  OK
OR4F5   0   0   0   OK  OK
LOC729737   0.11037 0.134682    0.122526    OK  OK
LOC100132287    0   0   0   OK  OK
LOC100133331    0   0   0   OK  OK
OR4F29  0   0   0   OK  OK
MIR6723 0   0   0   OK  OK
OR4F29  0   0   0   OK  OK

I tried to get the sequences of all these genes, but couldn't find them.
I downloaded a GTF file of UCSC genes from the Table Browser.
But the gene names are different, and there are no "gene" features in the file at all.
Only: CDS, exon, start_codon, and stop_codon.
The GTF file I got looks like:

chr1    hg19_knownGene  exon    11874   12227   0.000000    +   .   gene_id "uc001aaa.3"; transcript_id "uc001aaa.3"; 
chr1    hg19_knownGene  exon    12613   12721   0.000000    +   .   gene_id "uc001aaa.3"; transcript_id "uc001aaa.3"; 
chr1    hg19_knownGene  exon    13221   14409   0.000000    +   .   gene_id "uc001aaa.3"; transcript_id "uc001aaa.3";

Does anybody know where I can find a proper file?

14 months ago by
toralmanvar750
toralmanvar750 wrote:

Hello,

Looking at the expression file, it seems you obtained it from a reference-based transcriptome pipeline such as TopHat-Cufflinks. Your expression file also has gene symbols in the tracking_id column, which means you supplied a genome annotation file (.GTF) along with the reference genome FASTA file during mapping.

So why not use the same genome annotation file (.GTF) that was used during mapping to extract the gene coordinates (based on the tracking ID)? Those coordinates can then be used to fetch the gene sequences from the reference genome with bedtools.
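
The coordinate-extraction step could look roughly like the sketch below. It is only an illustration under some assumptions: that the GTF used during mapping stores the gene symbol in its gene_id attribute (some annotation files use gene_name instead, hence the id_key parameter), and that exon records are the features of interest. The function names are hypothetical and not part of any existing tool.

```python
import re

# Matches GTF attribute pairs such as: gene_id "DDX11L1";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def read_tracking_ids(expr_lines):
    """Collect the first column of the expression table, skipping the header line."""
    ids = set()
    for i, line in enumerate(expr_lines):
        if i == 0 or not line.strip():
            continue
        ids.add(line.split()[0])
    return ids


def gtf_exons_to_bed(gtf_lines, wanted_ids, id_key="gene_id"):
    """Return BED6 lines for exon records whose id_key attribute is in wanted_ids.

    Assumption: id_key holds the same identifiers as the tracking_id column.
    """
    bed = []
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        name = attrs.get(id_key)
        if name in wanted_ids:
            start, end = int(fields[3]), int(fields[4])
            # GTF coordinates are 1-based inclusive; BED is 0-based half-open.
            bed.append(f"{fields[0]}\t{start - 1}\t{end}\t{name}\t0\t{fields[6]}")
    return bed
```

Writing the returned lines to a file (say genes.bed, a placeholder name) gives an input for something like `bedtools getfasta -fi genome.fa -bed genes.bed -name -s -fo genes.fa`, where genome.fa stands for the hg19 FASTA used for mapping. Note that this yields one sequence per exon, not one per gene.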

14 months ago by
genomax62k
United States
genomax62k wrote:

You can generate an annotation file in GTF format for the UCSC hg19 genome by following the directions in this post.
