Question: [Prokaryote] How to go from reads to counts given that there is no exon column in gtf file?
7 days ago, akibio wrote:

I am trying to go from raw reads to counts and then to TPM/TMM values of gene expression for a prokaryotic organism (via mapping the RNA sequencing reads to the reference genome). I have read that an annotation file (gtf or gff3) is needed, and encountered this issue firsthand when STAR threw an error saying that my gtf file doesn't have any exon lines.

My question is: how should I go from mapped reads to counts and then to TPM or TMM, given that I can't find a gtf file with exon lines? I am open to using any of the reputable alignment packages; I've heard of Bowtie2 and STAR, for example. I should mention that the gff3 file does have exon lines, but I'm not sure whether STAR will accept that file.

The exact error that STAR throws is this:

Fatal INPUT FILE error, no exon lines in the GTF file: /Users/fastq/gtf_file.gtf
Solution: check the formatting of the GTF file, it must contain some lines with exon in the 3rd column.
          Make sure the GTF file is unzipped.
          If exons are marked with a different word, use --sjdbGTFfeatureExon .
rna-seq alignment • 82 views
modified 7 days ago by Juke34 (4.8k) • written 7 days ago by akibio

You don't have to use STAR per se, since you don't need a splice-aware aligner for a prokaryote. You could align with any aligner and then give featureCounts an annotation in SAF (simple annotation format) to do the read counting.
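
Since the question also asks how to get from counts to TPM, here is a minimal sketch of that step in Python. It assumes you have already pulled the per-gene counts and gene lengths (featureCounts writes a Length column next to the counts) into two dictionaries; the function name and the example numbers are made up for illustration. TMM is a different, between-sample normalisation that edgeR computes and is not shown here.

```python
def counts_to_tpm(counts, lengths):
    """Convert raw read counts to TPM.

    counts:  dict mapping gene ID -> read count for one sample
    lengths: dict mapping gene ID -> gene length in bases
    """
    # Step 1: length-normalise each gene (reads per kilobase).
    rates = {gene: counts[gene] / (lengths[gene] / 1000) for gene in counts}
    # Step 2: scale so that the values of one sample sum to one million.
    scale = sum(rates.values()) / 1_000_000
    if scale == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: rate / scale for gene, rate in rates.items()}


# Hypothetical two-gene example (values invented for illustration).
example_counts = {"geneA": 100, "geneB": 300}
example_lengths = {"geneA": 1000, "geneB": 3000}
tpm = counts_to_tpm(example_counts, example_lengths)
print(tpm)
```

Both genes have the same coverage per kilobase here, so they end up with the same TPM even though geneB has three times the raw count.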

written 7 days ago by genomax (91k)

Thanks, this then may be a silly question but does featureCounts require the exon column?

Also, I initially chose STAR because it is reported to be much faster than most other aligners, but I wonder whether that benefit carries over to prokaryotic genomes.

written 7 days ago by akibio

See the link included in my comment above for an explanation of SAF format. Simple answer is no. You can make up a file in SAF format yourself by listing, for each gene, a gene name, the chromosome (there would be just one in your case unless you have plasmids), the gene start and stop, and the strand.
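
If writing the SAF file by hand is tedious, it could be generated from the gff3 file you already have. The following is a minimal sketch, not a tested pipeline: it assumes the gff3 has `gene` features carrying an `ID=` attribute, and the function name, the `chr1` sequence name, and the two example genes are hypothetical. SAF is tab-delimited with the columns GeneID, Chr, Start, End, Strand.

```python
def gff3_to_saf(gff3_lines, feature_type="gene", id_key="ID"):
    """Return SAF rows (GeneID, Chr, Start, End, Strand) built from GFF3 lines."""
    rows = ["GeneID\tChr\tStart\tEnd\tStrand"]  # SAF header line
    for line in gff3_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        fields = line.split("\t")
        if len(fields) != 9 or fields[2] != feature_type:
            continue  # keep only the chosen feature type (column 3)
        chrom, start, end, strand = fields[0], fields[3], fields[4], fields[6]
        # Column 9 holds key=value pairs separated by semicolons.
        attributes = dict(
            pair.split("=", 1) for pair in fields[8].split(";") if "=" in pair
        )
        gene_id = attributes.get(id_key)
        if gene_id is not None:
            rows.append("\t".join([gene_id, chrom, start, end, strand]))
    return rows


# Hypothetical GFF3 snippet (coordinates invented for illustration).
example_gff3 = (
    "##gff-version 3\n"
    "chr1\tsrc\tgene\t190\t255\t.\t+\t.\tID=geneA;Name=geneA\n"
    "chr1\tsrc\tCDS\t190\t255\t.\t+\t0\tID=cdsA;Parent=geneA\n"
    "chr1\tsrc\tgene\t337\t2799\t.\t-\t.\tID=geneB;Name=geneB\n"
)
saf_rows = gff3_to_saf(example_gff3.splitlines())
print("\n".join(saf_rows))
```

The resulting file would then be passed to featureCounts with `-F SAF -a` in place of a gtf. If your gff3 uses a different feature type or attribute key, change `feature_type` and `id_key` accordingly.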

There are plenty of other aligners that are fast. bwa mem would fit the bill.

modified 7 days ago • written 7 days ago by genomax (91k)

7 days ago, Juke34 (4.8k) wrote:

You can use AGAT; it will re-create the exon features.

written 7 days ago by Juke34 (4.8k)