Error in piRNA gtf file while featurecounts step in STAR aligner?
16 months ago

Hello, I have a problem at the featureCounts step after aligning with STAR. I obtained a BAM file by indexing the genome and mapping my data, but when I try to quantify piRNA counts with featureCounts I get a GTF file error (the 'exon' entry is missing from the 3rd column).

WARNING no features were loaded in format GTF.
|| Failed to open the annotation file /home/bioinformatics/Desktop/pirnadb.v1_7_6.hg38.gtf, or its format is incorrect, or it contains no 'exon' features.

When I compared a miRNA GTF file with the piRNA one, I found that the Start-codon column is missing from the piRNAdb GTF file. As an alternative, I tried the GFF3 file with the htseq-count tool, which ended up giving the same error.

I hope you can suggest another way to obtain quality piRNA reads, and where to download a suitable GTF file.

alignment R • 547 views
16 months ago
michael.ante ★ 3.6k

Hi,

It seems that your GTF doesn't follow the (loose) standards. See e.g. the UCSC format FAQs about GFF2 and GTF.

The easiest way to repair your GTF is to use bioawk to set the feature column to "exon":

bioawk -c gff '{$feature="exon"; print}' pirnadb.v1_7_6.hg38.gtf


Nevertheless, the gene_id and transcript_id are missing from the attributes section and need to be included as well. Assuming the piRNA code is unique, you can use sed to insert the missing IDs:

sed 's/piRNA_code \(\"hsa-piR-[0-9][0-9]*\"\;\)/gene_id \1 transcript_id \1/g'


You can pipe these two commands together: bioawk ... | sed ... > new_pirnadb.gtf
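
If bioawk is not installed, the same two repairs could be sketched with plain awk. This is only a sketch under assumptions: it assumes the file is tab-separated with nine columns and that the attribute column contains an entry of the form piRNA_code "hsa-piR-..."; (I have not checked the actual piRNAdb file). The function name fix_pirna_gtf is made up for illustration. Unlike the sed command, it keeps the piRNA_code attribute and puts gene_id and transcript_id in front of it.

```shell
# Hypothetical helper: reads a GTF on stdin, writes the repaired GTF to stdout.
fix_pirna_gtf() {
    awk 'BEGIN { FS = OFS = "\t" }
         # pass comment lines and lines without nine columns through unchanged
         /^#/ || NF < 9 { print; next }
         {
             # featureCounts and htseq-count look for "exon" features by default
             $3 = "exon"
             # assumed attribute layout: piRNA_code "hsa-piR-123";
             if ($9 !~ /gene_id/ && match($9, /piRNA_code "[^"]+";/)) {
                 # skip the 11 characters of "piRNA_code " to keep only the quoted ID and semicolon
                 id = substr($9, RSTART + 11, RLENGTH - 11)
                 $9 = "gene_id " id " transcript_id " id " " $9
             }
             print
         }'
}

# usage: fix_pirna_gtf < pirnadb.v1_7_6.hg38.gtf > new_pirnadb.gtf
```

Both counting tools group reads by the gene_id attribute by default, which is why the ID has to be present on every line and not only the "exon" label.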

I hope this will solve the issue.

Cheers,

Michael

Hi Michael, thank you for your suggestion. I tried your method, but I can't add the gene_id and transcript_id to my output file. I hope you can see what is causing this error.

Thank you in advance. Regards, Geetha.

Hi Geetha,

Did you receive an error? Can you provide the first couple of lines from your input and output gtf?
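
A quick summary of each file would also help. As a rough sketch (the helper name gtf_summary is made up, and it assumes a tab-separated GTF with nine columns), something like this tallies the feature types in column 3 and counts how many lines carry a gene_id attribute:

```shell
# Hypothetical helper: reads a GTF on stdin and prints a short summary.
gtf_summary() {
    awk -F '\t' '!/^#/ && NF >= 9 {
             feature[$3]++                        # tally feature types in column 3
             if ($9 ~ /gene_id "[^"]+"/) with_id++  # lines that carry a gene_id attribute
             total++
         }
         END {
             for (f in feature) printf "feature\t%s\t%d\n", f, feature[f]
             printf "with_gene_id\t%d/%d\n", with_id, total
         }'
}

# usage: head -n 5 new_pirnadb.gtf; gtf_summary < new_pirnadb.gtf
```

If the output file shows no "exon" features or a gene_id count of 0, the corresponding step of the pipe did not match your file.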