I have some paired-end FASTQ files that supposedly come directly from an Illumina sequencer, but they contain a number of records with duplicate names (and different sequences), which MergeBamAlignment complains about. So I need a tool to remove all such duplicates. I saw the advice to use seqtk in "Duplicate/identical reads in fastq file", but seqtk leaves one copy of each duplicate in place. That may lead to wrong results, because there is no guarantee that the copies it keeps in the two files are reads from the same pair.
Is there a tool that removes all reads that have duplicate names?
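To make the desired behaviour concrete, here is a minimal Python sketch of what I am after. It is not an existing tool, and the function names (`read_name`, `records`, `duplicated_names`, `drop_duplicated`) are made up for illustration. It assumes uncompressed FASTQ with exactly four lines per record, and that the read name is the first whitespace-delimited token of the header, with an optional trailing `/1` or `/2`. Any name that occurs more than once in either file is dropped from both files, so no copy survives.

```python
import collections
import itertools

def read_name(header):
    """Return the header's first token without '@' and without a /1 or /2 suffix."""
    name = header[1:].split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name

def records(path):
    """Yield FASTQ records as lists of 4 lines (assumes unwrapped sequences)."""
    with open(path) as handle:
        while True:
            rec = list(itertools.islice(handle, 4))
            if not rec:
                return
            yield rec

def duplicated_names(path):
    """Return the set of read names that occur more than once in one file."""
    counts = collections.Counter(read_name(rec[0]) for rec in records(path))
    return {name for name, n in counts.items() if n > 1}

def drop_duplicated(in_paths, out_paths):
    """Write copies of the inputs without any read whose name is duplicated
    in at least one of the inputs. Returns the set of dropped names."""
    bad = set()
    for path in in_paths:
        bad |= duplicated_names(path)
    for src, dst in zip(in_paths, out_paths):
        with open(dst, "w") as out:
            for rec in records(src):
                if read_name(rec[0]) not in bad:
                    out.writelines(rec)
    return bad
```

For example, `drop_duplicated(["R1.fastq", "R2.fastq"], ["R1.clean.fastq", "R2.clean.fastq"])` (file names are placeholders). Note that this keeps every read name of a file in memory while counting, which may be a problem for very large files.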
Thanks in advance.