Question: GTF to match transcriptome data in TopHat
8 weeks ago by blur110 (European Union)

Hi,

I am trying to do ribosome profiling, and to that end I have been trying to align reads to the human transcriptome, with no success. My aligner is TopHat. The hg19 transcriptome FASTA file looks like this:

>uc001aaa.3
cttgccgtcagccttttctttgacctcttc

I assume I need a GTF file for the run, but the GTF file that I downloaded from UCSC looks like this:

chr1    hg19_knownGene  exon    11874   12227   0.000000    +   .   gene_id "uc001aaa.3"; transcript_id "uc001aaa.3";

When I run TopHat, I get this error message:

[2020-02-04 15:15:08] Building Bowtie index from ucsc_hg19.fa
    [FAILED]

Looking through the posts here, I think the problem is that my GTF does not match the transcriptome. I have tried to figure out whether I can get a transcriptome-based GTF from UCSC, but I couldn't find one. Or am I doing this the wrong way, and should I have used a differently built reference? I have also downloaded the genomic data that includes the protein-coding genes. The names in that file look like this:

>hg19_knownGene_uc001aaa.3 range=chr1:11874-14409 5'pad=0 3'pad=0 strand=+ repeatMasking=none
cttgccgtcagccttttctttgacctcttctttctgttcatgtgtatttg
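
To make the suspected mismatch concrete, here is a minimal Python sketch that compares the sequence names in the FASTA headers with the first column of the GTF. The helper names `fasta_names` and `gtf_seqnames` are made up for illustration, and the inline records are just the lines quoted in this post. It assumes the aligner needs the GTF's first column to match the reference sequence names, which is my understanding rather than something I have confirmed.

```python
import io


def fasta_names(handle):
    """Return the set of sequence names (first word after '>') in a FASTA stream."""
    return {
        line[1:].split()[0]
        for line in handle
        if line.startswith(">") and len(line.strip()) > 1
    }


def gtf_seqnames(handle):
    """Return the set of values in column 1 (seqname) of a GTF stream."""
    names = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        names.add(line.split()[0])
    return names


# Records copied from this post (sequences shortened); not real file reads.
fasta = io.StringIO(
    ">uc001aaa.3\ncttgccgtcagcc\n"
    ">hg19_knownGene_uc001aaa.3 range=chr1:11874-14409\ncttgccgtcagcc\n"
)
gtf = io.StringIO(
    'chr1\thg19_knownGene\texon\t11874\t12227\t0.000000\t+\t.\t'
    'gene_id "uc001aaa.3"; transcript_id "uc001aaa.3";\n'
)

shared = fasta_names(fasta) & gtf_seqnames(gtf)
print(shared)  # set() -> no name is common to both files
```

With these records the FASTA names are transcript IDs while the GTF's first column is `chr1`, so the intersection is empty.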

The end goal is to look only at the transcriptome data, compute RPKM, and check expression.
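
For clarity, by RPKM I mean reads per kilobase of transcript per million mapped reads. A minimal sketch of that calculation is below. The function name `rpkm` and the numbers in the usage line are hypothetical and are not from my data.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = read_count * 1e9 / (transcript_length_bp * total_mapped_reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000 bp transcript,
# out of 10 million mapped reads in the library.
print(rpkm(500, 2000, 10_000_000))  # 25.0
```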

Any help would be greatly appreciated,

Tags: transcriptome, tophat, gtf