3.8 years ago
rupjandu
Hello everyone, I am using the web-based Galaxy tool, not the command line version. I merged FASTA files into one and I'm trying to construct a BLAST database with these local sequences through the makeblastdb function. I get an error message that reads, "Error: Duplicate seq_ids are found: GNL|BL_ORD_ID|9650923".
Can anyone help me find a way to remove the duplicate seq IDs, preferably using the web-based Galaxy tool?
Thank you!
If you need Galaxy-specific assistance, please post this on their help forum: https://help.galaxyproject.org/
1. Run `sed -nr '/^>/p' input.fa | sort -V | uniq -D | uniq -c` on the downloaded file (input.fa). This should print the duplicated/identical headers and their counts.
2. Install the `seqkit` tool and run `seqkit rename -n input.fa -o output.fa`. This generates a new file, `output.fa`, and appends serial numbers to the end of FASTA IDs/headers that are identical.
3. Run `sed -nr '/^>/p' output.fa | sort -V | uniq -D | uniq -c` on the new file (output.fa). This should not print any lines.
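If installing seqkit is not an option, the renaming step above can be approximated with plain awk. This is only a minimal sketch, not what seqkit does internally: the function name `dedup_fasta_ids` is hypothetical, it assumes an uncompressed FASTA file, and it compares only the first word of each header (the sequence ID) rather than the full header line. It also does not check whether a generated ID such as `seq1_2` already exists elsewhere in the file.

```shell
# Hypothetical helper: append _2, _3, ... to repeated sequence IDs.
# Assumption: the ID is the first whitespace-delimited word after ">".
dedup_fasta_ids() {
  awk '
    /^>/ {
      id = substr($1, 2)                    # first word without the leading ">"
      n = ++seen[id]                        # times this ID has appeared so far
      if (n > 1) {
        rest = substr($0, length($1) + 1)   # description after the ID, if any
        $0 = ">" id "_" n rest              # rewrite the header with a suffix
      }
    }
    { print }                               # headers and sequence lines alike
  ' "$1"
}
```

It could be used as `dedup_fasta_ids input.fa > output.fa`, after which the duplicate check on `output.fa` can be repeated.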