Hello,
I am trying to use MCL to do paralog clustering of 13 genomes for comparative genomics. I have been using the protocol "Using MCL to Extract Clusters from Networks" as a reference for doing this. The protocol says that I have to run blastall -p blastp with the -m8 parameter on my protein FASTA files before I can run MCL. It states that instructions for this step are in the supplementary material, but I have not been able to find them. I am confused about how to do this step, as it has not worked for me so far. How do I do this so I can move on to running MCL? Thank you in advance.
-Brittany
Thank you for your response! It was not working because of how I formulated the command. Do I run each of the 13 protein fasta files separately? And also, what database am I supposed to be blasting them to in order to run MCL?
From what I understand, your goal is to cluster paralogs from 13 different genomes. That means concatenating all genomes into one file, running an all-vs-all BLAST search on it, and finally clustering the hits with MCL.
Concatenation (typing only 3 genome names):
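A minimal sketch of this step, assuming the protein FASTA files are named genome1.faa, genome2.faa and genome3.faa (hypothetical names, as is the output name all_genomes.faa). The toy inputs are only there so the snippet runs on its own; substitute your real files and list all 13 of them.

```shell
# Toy inputs so the sketch runs on its own; replace these with your real protein FASTA files.
printf '>g1_prot1\nMKTAYIAK\n' > genome1.faa
printf '>g2_prot1\nMLSRAVCG\n' > genome2.faa
printf '>g3_prot1\nMAVHHLKQ\n' > genome3.faa

# Concatenate all proteomes into a single file (hypothetical output name).
cat genome1.faa genome2.faa genome3.faa > all_genomes.faa

# Sanity check: print the number of sequences in the combined file.
grep -c '^>' all_genomes.faa
```

It is probably worth checking that sequence IDs are unique across genomes before concatenating, since the IDs end up as the node labels in the clustering.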
Formatting the BLAST database (for BLAST, not BLAST+, though it may work with that too):
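A sketch of the formatting step with legacy BLAST's formatdb, assuming the concatenated file is called all_genomes.faa (a hypothetical name). The checks around the command are only there so the snippet fails politely when the tool or the file is missing.

```shell
db=all_genomes.faa   # hypothetical name of the concatenated protein FASTA file

if ! command -v formatdb >/dev/null 2>&1; then
    echo "formatdb (legacy NCBI BLAST) not found on PATH" >&2
elif [ ! -s "$db" ]; then
    echo "$db is missing or empty; concatenate the genomes first" >&2
else
    # -i: input FASTA file; -p T: the sequences are protein
    formatdb -i "$db" -p T
fi
```

With BLAST+ the corresponding command would be makeblastdb -in all_genomes.faa -dbtype prot.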
BLASTing:
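A sketch of the all-vs-all search, assuming the same hypothetical all_genomes.faa file has already been formatted as a database, and using the 1e-5 E-value cutoff mentioned in this thread. The output name all_vs_all.blastp.tsv is also just an assumption.

```shell
db=all_genomes.faa            # hypothetical: concatenated FASTA, already formatted as a database
out=all_vs_all.blastp.tsv     # hypothetical output file name

if ! command -v blastall >/dev/null 2>&1; then
    echo "blastall (legacy NCBI BLAST) not found on PATH" >&2
elif [ ! -s "$db" ]; then
    echo "$db is missing or empty; concatenate and format the genomes first" >&2
else
    # -p blastp: protein query against protein database
    # -d / -i:   database and query are the same file, which makes the search all-vs-all
    # -m 8:      tabular output, as the MCL protocol expects
    # -e 1e-5:   E-value cutoff; -o: output file
    blastall -p blastp -d "$db" -i "$db" -m 8 -e 1e-5 -o "$out"
fi
```

Using the same file as both query and database answers the "which database" question: every protein is searched against every protein from all 13 genomes.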
After that, follow the protocol from my previous message. You may need to increase the E-value cutoff from 1e-5 - not sure what is appropriate for paralog detection.
Thank you so much! Your reply was very helpful!