7.7 years ago
malik.yousef
Hello,
Given two FASTA files (DNA), how can I remove duplicated sequences (those that are about 80-90% similar or more), or keep just one copy of each, the one in the first file? Which tools should I use, and how do I do that?
Best, Malik
Thanks for your reply. I can't run it because I'm using Cygwin, and I get the following error: -bash: ./seqkit: cannot execute binary file: Exec format error
You can run BBMap on a PC. It is pure Java, so no Cygwin is needed.
You can download the Windows version, which has no dependencies:
seqkit_windows_386.exe.tar.gz or seqkit_windows_amd64.exe.tar.gz
OK, I have it on Windows and it works. Still, I would prefer to run it in Cygwin. What should I do?
Well, it seems that Go cannot compile executable binaries for Cygwin. Linux, Windows, and Mac OS X are all supported, but not Cygwin :(
OK, so SeqKit rmdup removes duplicated sequences. How do I remove sequences with a similarity of, say, 90% and above?
Try USEARCH, VSEARCH, CD-HIT, or other clustering software.
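To make the idea behind similarity-based removal concrete, here is a minimal Python sketch. It is not what USEARCH, VSEARCH, or CD-HIT actually do: it uses the standard library's `difflib.SequenceMatcher` ratio as a crude stand-in for alignment identity, compares every pair (so it is only practical for small files), and ignores reverse complements. The function names `read_fasta`, `similarity`, and `drop_near_duplicates` are made up for this illustration. It keeps every record from the first file and adds a record from the second file only if it is below the threshold against everything kept so far.

```python
from difflib import SequenceMatcher


def read_fasta(lines):
    """Parse FASTA text lines into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())  # compare case-insensitively
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def similarity(a, b):
    """Crude similarity in [0, 1]; an assumption, not true alignment identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def drop_near_duplicates(first, second, threshold=0.9):
    """Keep all of `first`; add records of `second` that are below
    `threshold` similarity to every record kept so far."""
    kept = list(first)
    for header, seq in second:
        if all(similarity(seq, kept_seq) < threshold for _, kept_seq in kept):
            kept.append((header, seq))
    return kept


if __name__ == "__main__":
    first = read_fasta([">a", "ACGTACGTACGTACGTACGT", ">b", "TTTTTTTTTTTTTTTTTTTT"])
    second = read_fasta([">c", "acgtacgtacgtacgtacgt", ">d", "GGGGGGGGGGGGGGGGGGGG"])
    for header, seq in drop_near_duplicates(first, second):
        print(">" + header)
        print(seq)
```

With real files you would pass an open file handle to `read_fasta`, for example `read_fasta(open("first.fasta"))`. For anything beyond a few thousand short sequences, one of the clustering tools named above is the better choice.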