I've tried to use STAR/STARlong to map my PacBio Iso-Seq (Kinnex) reads, with no success.
This is the error I get:
EXITING because of FATAL ERROR in reads input: quality string length is not equal to sequence length
My files/reads are not corrupted or truncated, and I'm using the raw data. I have checked the files in several different ways (including fastQValidator). I've tried mapping them both gzipped and uncompressed, and also as a subsample, but I get the same error every time. STAR doesn't like my reads. Minimap2 works well with them; I ran it with no issues.
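For anyone who wants to repeat this kind of check on their own data, below is a minimal sketch of a scan for the condition named in the error message, i.e. records whose quality string and sequence differ in length. It assumes plain four-line FASTQ records (no wrapped sequences); the function name `find_length_mismatches` is made up for illustration, and this is not what STAR or fastQValidator do internally.

```python
import gzip
import sys
from itertools import islice


def find_length_mismatches(path, limit=10):
    """Return up to `limit` tuples (record_number, read_name, seq_len, qual_len)
    for records whose sequence and quality lengths differ.

    Assumption: every record occupies exactly four lines.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    mismatches = []
    with opener(path, "rt") as handle:
        record_number = 0
        while len(mismatches) < limit:
            record = list(islice(handle, 4))  # read one 4-line record
            if not record:
                break  # end of file
            record_number += 1
            if len(record) < 4:
                raise ValueError(f"record {record_number} is truncated")
            header, seq, _plus, qual = (line.rstrip("\r\n") for line in record)
            if len(seq) != len(qual):
                mismatches.append((record_number, header[1:], len(seq), len(qual)))
    return mismatches


# Only runs when a FASTQ path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    for rec in find_length_mismatches(sys.argv[1]):
        print(*rec, sep="\t")
```

If this reports nothing for a file that STAR still rejects, the lengths in the file itself agree, which would point at how the reads are being parsed rather than at the data.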
So, my questions are:
Is there any real difference between STAR and STARlong? The manual seems to be exactly the same for both, and I got the same error with each, using both default and custom parameters.
Have you had the same issue? Have you found a solution?
Are there any other mappers you can recommend for PacBio Iso-Seq? The intron SAM flags are very important to me, which is why I'm insisting on using STAR.
What is the average length of your reads? That error sounds like a bug, so you may want to post it on the STAR repo as an issue. I assume you have seen https://isoseq.how/
The average is 1.8 kb. I was planning to post it on GitHub, but the repository seems to have been unattended for the last few months, so I decided to post it here instead.
You could give mapPacBio.sh from the BBMap suite a try while you wait for other answers. That works for reads up to 6 kb. BBMap has a new home on the web: https://bbmap.org/
Thanks, I'll give it a try.