I am seeking a way to correct the formats for inputs to FeatureCounts.
I have sourced data from a resource for sweet potato genome annotations, and aligned my RNA-seq data to the available reference genome. However, it seems FeatureCounts cannot report on the SAM files I input.
The reference genome FASTA file does not clearly name the chromosomes in its headers. I believe this is why FeatureCounts is not working, but I want to be sure before I start scripting in different directions to correct the files I have.
To be clear, I am providing the genome annotation file along with each of my aligned samples.
Example reference genome sequence header:
The above gene identifier shows that it is on chromosome 11.
Chr11 gt4sp_v3 mRNA 1467 2621 . + . ID=itf01g00010.t1;Name=itf01g00010.t1;Parent=itf01g00010
I am not getting errors from FeatureCounts, but it is possible that Chr11 needs to appear in the FASTA header so that the sequence names in my alignments match those in the annotation. Also, I convert the GFF3 file to GTF so that FeatureCounts has a GTF as input.
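One way to test this suspicion before rewriting any files is to compare the sequence names in a SAM header (the `SN:` tag on `@SQ` lines) with the names in the first column of the GTF. The following is a minimal sketch, not a definitive diagnostic: the function names are hypothetical, and it assumes the SAM file has a header and the GTF is tab-delimited.

```python
def sam_reference_names(sam_lines):
    """Collect reference sequence names from the SN: tag of @SQ header lines."""
    names = set()
    for line in sam_lines:
        if not line.startswith("@"):
            break  # header lines come first; stop at the first alignment record
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def annotation_seqids(gtf_lines):
    """Collect sequence names from the first column of a GTF/GFF file."""
    names = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def compare_names(sam_lines, gtf_lines):
    """Report which sequence names are shared and which appear on one side only."""
    sam = sam_reference_names(sam_lines)
    gtf = annotation_seqids(gtf_lines)
    return {
        "shared": sorted(sam & gtf),
        "sam_only": sorted(sam - gtf),
        "gtf_only": sorted(gtf - sam),
    }
```

Both functions accept any iterable of lines, so open file handles can be passed directly, for example `compare_names(open("sample.sam"), open("annotation.gtf"))`, where both file names are placeholders. If `shared` comes back empty, the alignments and the annotation are using different names for the same sequences, which would be consistent with the mismatch suspected above.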