I'm trying to sort a big FASTQ file by read name, which doesn't seem like an exotic thing to do. I managed it with seqkit, but that doesn't scale to a big file (it crashes when it runs out of memory). Many of the approaches I tried ranked the read name containing 10000 before the one containing 1720,
although I have to match this with its corresponding BAM, which was "correctly" sorted by samtools sort. My guess is that, since 10000 has one more digit than 1720, it comes first (probably because a digit comes before a colon). I got this result with a bash solution based on sort, and with BBMap, for example. I could code it myself (for example by sorting on each number between the colons), but I'm pretty astonished this doesn't already exist. Any hints?
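
For what it's worth, here is a rough sketch of the "code it myself" idea in Python. A plain string comparison goes character by character, so "10000" sorts before "1720" as soon as '0' is compared with '7'. The sketch instead compares each run of digits as an integer. To stay within memory it sorts the file in chunks, writes each chunk to a temporary file, and merges the chunks. The names `natural_key`, `read_records`, `sort_fastq` and the `chunk_records` parameter are made up for illustration. It assumes an uncompressed FASTQ with exactly four lines per record, and I have not checked that this ordering matches samtools sort in every corner case (leading zeros, for instance), so treat it as a starting point only.

```python
import heapq
import itertools
import re
import tempfile

_DIGITS = re.compile(r"(\d+)")


def natural_key(name):
    """Split a read name into text and integer pieces (hypothetical helper)."""
    parts = _DIGITS.split(name)
    # With a capturing group, re.split alternates text, digits, text, ...
    # so odd positions are always digit runs and can be compared as ints.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]


def read_records(handle):
    """Yield (sort key, record text) for each 4-line FASTQ record."""
    while True:
        lines = list(itertools.islice(handle, 4))
        if not lines or not lines[0].strip():
            return
        # Read name = header line without '@', up to the first whitespace.
        name = lines[0][1:].split()[0]
        # Make sure every line ends with a newline, even the last one.
        record = "".join(line.rstrip("\n") + "\n" for line in lines)
        yield natural_key(name), record


def sort_fastq(in_path, out_path, chunk_records=1_000_000):
    """External merge sort: sort chunks in memory, then merge them."""
    chunks = []
    with open(in_path) as src:
        records = read_records(src)
        while True:
            batch = sorted(itertools.islice(records, chunk_records),
                           key=lambda kr: kr[0])
            if not batch:
                break
            tmp = tempfile.TemporaryFile("w+")
            tmp.writelines(rec for _, rec in batch)
            tmp.seek(0)
            chunks.append(tmp)
    with open(out_path, "w") as dst:
        merged = heapq.merge(*(read_records(c) for c in chunks),
                             key=lambda kr: kr[0])
        dst.writelines(rec for _, rec in merged)
    for c in chunks:
        c.close()
```

Calling `sort_fastq("reads.fastq", "reads.sorted.fastq")` (file names are placeholders) keeps at most `chunk_records` records in memory at a time. Every chunk stays open during the merge, so a very small `chunk_records` on a very large file could run into the open-file limit.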