Hi.
I'm trying to use RSEM to calculate gene expression for my RNA-Seq experiment. I assembled the reads with IDBA-UD and predicted the transcripts with Prodigal. So I have a FASTA file with the contigs from IDBA-UD, another FASTA file with the transcripts from Prodigal, and a GFF file generated by Prodigal. I tried using the GFF file in RSEM with no success, then converted it to a GTF file, and that is not working either. My last attempt was the following:
rsem-prepare-reference contigs.fasta reference_name --gtf prodigal.gtf --bowtie2
My GTF file looks like this:
contig-100_0 Prodigal_v2.6.3 CDS 3 503 10.6 + 0 gene_id "contig-100_0_1"; transcript_id "contig-100_0_1";
contig-100_0 Prodigal_v2.6.3 CDS 507 776 19.2 + 0 gene_id "contig-100_0_2"; transcript_id "contig-100_0_2";
contig-100_0 Prodigal_v2.6.3 CDS 848 1201 37.4 + 0 gene_id "contig-100_0_3"; transcript_id "contig-100_0_3";
contig-100_0 Prodigal_v2.6.3 CDS 1198 1464 44.3 + 0 gene_id "contig-100_0_4"; transcript_id "contig-100_0_4";
contig-100_0 Prodigal_v2.6.3 CDS 1461 1655 10.7 + 0 gene_id "contig-100_0_5"; transcript_id "contig-100_0_5";
And, finally, the error is this:
Parsed 200000 lines
Parsed 400000 lines
The reference contains no transcripts! failed! Plase check if you provide correct parameters/options for the pipeline!
It worked!
I just created a GTF file with the exon and transcript lines and it worked perfectly!
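For anyone who lands here with the same error, below is a minimal sketch of that kind of conversion. It is not necessarily the exact procedure I followed. It assumes the GTF is tab-separated with nine columns, that every Prodigal CDS can be treated as a single-exon transcript, and (my understanding, not something I verified in the RSEM source) that RSEM builds its transcripts from exon features rather than CDS features. The function name and the file names in the usage comment are made up for illustration.

```python
import sys


def cds_to_transcript_exon(gtf_lines):
    """Yield one 'transcript' line and one 'exon' line per CDS line.

    Assumption: each CDS is a complete single-exon transcript, so the
    coordinates, strand and attributes are copied unchanged.
    """
    for line in gtf_lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Only well-formed 9-column CDS records are converted.
        if len(fields) != 9 or fields[2] != "CDS":
            continue
        for feature in ("transcript", "exon"):
            out = list(fields)
            out[2] = feature
            out[7] = "."  # the frame column is only meaningful for CDS
            yield "\t".join(out)


# Hypothetical usage: python cds_to_exon.py prodigal.gtf > prodigal.exon.gtf
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for record in cds_to_transcript_exon(handle):
            print(record)
```

The resulting file can then be passed to rsem-prepare-reference through --gtf in place of the CDS-only file.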
Thank you, h.mon!