Hello,
I am trying to run htseq-count for an RNA-Seq differential expression analysis and have run into an error with the GFF file I am using. The exact error is as follows:
Error occured when processing GFF file (line 3497 of file /Users/*.gff):
start is larger than end
[Exception type: ValueError, raised in _HTSeq.pyx:64]
I looked at line 3497, and the start position is indeed larger than the end position, so the error makes sense, but I am not sure how to fix it. Is there a way to simply omit that line of the .gff file from the terminal? Any advice would be greatly appreciated.
Thank you,
Evan
That's what I was hoping to do, but how would you go about doing that? I tried opening the GFF file in Excel, but it messed up the overall format of the file.
Open it in a plain-text editor (think WordPad or Notepad on Windows, or TextEdit on a Mac). You should NEVER use Excel in bioinformatics.
Thank you for the tip, that made it possible to edit the file without breaking it. Another problem I have run into, though, is that there are far too many lines in the GFF file where "start is larger than end" to fix them all by hand. I looked into the options for htseq-count, but none of them seemed to prevent the error.
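One way to handle this without hand-editing is a short script that writes a copy of the file with the offending lines left out. The following is only a minimal sketch, assuming a standard tab-separated GFF in which columns 4 and 5 hold start and end; the function names and the script name are made up for illustration and are not part of HTSeq. Note that it drops the affected features entirely, which may not be what you want: start > end can also mean that a feature wraps around the origin of a circular genome or that the two coordinates were simply written in the wrong order, so it is worth inspecting the dropped lines before trusting the counts.

```python
import sys


def is_inverted(line):
    """Return True if a GFF feature line has start > end (columns 4 and 5)."""
    if line.startswith("#") or not line.strip():
        return False  # comments, directives and blank lines are not features
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        return False  # not a feature line (e.g. embedded sequence); leave it alone
    try:
        return int(fields[3]) > int(fields[4])
    except ValueError:
        return False  # non-numeric coordinates; leave the line untouched


def filter_gff(lines):
    """Split lines into those to keep and the 1-based numbers of dropped lines."""
    kept, dropped = [], []
    for number, line in enumerate(lines, start=1):
        if is_inverted(line):
            dropped.append(number)
        else:
            kept.append(line)
    return kept, dropped


# Hypothetical usage: python drop_inverted_gff.py input.gff output.gff
if __name__ == "__main__" and len(sys.argv) == 3:
    in_path, out_path = sys.argv[1], sys.argv[2]
    with open(in_path) as handle:
        kept, dropped = filter_gff(handle)
    with open(out_path, "w") as handle:
        handle.writelines(kept)
    # Report what was removed so the affected features can be checked by eye.
    print(f"dropped {len(dropped)} line(s): {dropped}", file=sys.stderr)
```

The original file is never modified; the filtered copy is what you would then pass to htseq-count. If the inverted records turn out to be plain coordinate swaps, exchanging the two columns instead of dropping the line may be the better fix, but that depends on how the annotation was produced.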