I have reads from a eukaryotic genome, and there are duplicates due to sequencing. In a traditional environment, I would align them, then mark and remove duplicates, but here I have no reference.
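To make the question concrete, the only reference-free approach I can think of is to drop any read whose sequence is identical to one already seen. Below is a minimal Python sketch of that idea. It assumes single-end reads in plain, uncompressed four-line FASTQ records, and the function name `dedup_fastq` is purely illustrative, not taken from any existing tool. It catches only exact duplicates: reads that differ by a single sequencing error, reverse-complement copies, and paired-end layouts are not handled.

```python
def dedup_fastq(handle):
    """Yield FASTQ records (header, sequence, separator, quality) whose
    sequence has not been seen earlier in the file.

    Assumption: each record spans exactly four lines (no wrapped sequences).
    """
    seen = set()  # sequences already emitted; kept in memory
    while True:
        record = [handle.readline() for _ in range(4)]
        if not record[0]:
            break  # end of file
        sequence = record[1].strip()
        if sequence not in seen:
            seen.add(sequence)
            # Keep the first occurrence, with trailing newlines removed
            yield tuple(line.rstrip("\n") for line in record)
```

The first copy of each sequence is kept regardless of its quality scores, and memory use grows with the number of distinct sequences, so this would probably not scale to a full eukaryotic dataset without hashing or sorting on disk.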
I am wondering: is there any software that removes duplicates from raw sequence data? What is your experience with it?
Sorry in advance if the question is naive.