Tutorial: An easy way to run orthoMCL (copy & paste)
6
14
Entering edit mode
6.6 years ago
Esaie ▴ 160

After spending a lot of time on the 13 steps of orthoMCL, I finally got it to run successfully. For me the difficulty was not orthoMCL itself but knowing how to use each command, because the documentation is not clear enough. Here I present the commands and parameters I used, following the OrthoMCL User's Guide: http://orthomcl.org/common/downloads/software/v2.0/UserGuide.txt.

(1) I am using MySQL. You first need to create a database named orthomcl. This is my orthomcl.config.template file:

 # this config assumes a mysql database named 'orthomcl'.  adjust according

dbVendor=mysql
dbConnectString=dbi:mysql:orthomcl
similarSequencesTable=SimilarSequencesorthomcl
orthologTable=Orthologorthomcl
inParalogTable=InParalogorthomcl
coOrthologTable=CoOrthologorthomcl
interTaxonMatchView=InterTaxonMatchorthomcl
percentMatchCutoff=50
evalueExponentCutoff=-5
oracleIndexTblSpc=NONE


(2) and (3) are straightforward.

(4) On the command line, change to the root of the orthomcl directory and run this command; it will create the tables in your orthomcl database:

orthomclSoftware-v2.0.9$ bin/orthomclInstallSchema my_orthomcl_dir/orthomcl.config.template my_orthomcl_dir/install_schema.log

(5) Following the orthomcl guide, change to the compliantFasta directory:

orthomclSoftware-v2.0.9$ cd my_orthomcl_dir/compliantFasta/

and then run orthomclAdjustFasta on each proteome (here cow, human and mouse):

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclAdjustFasta cow /home/....../cow.fasta 1
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclAdjustFasta hsa /home/....../human.fasta 1
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclAdjustFasta mus /home/....../mouse.fasta 1

These commands will create 3 files in the compliantFasta directory, named cow.fasta, hsa.fasta and mus.fasta.

(6)

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclFilterFasta . 10 20


(7) All-v-all BLAST

First create a local BLAST database from goodProteins.fasta:

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ makeblastdb -in goodProteins.fasta -dbtype prot -out my_prot_blast_db

and then run the all-v-all search:

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ blastall -p blastp -F 'm S' -v 100000 -b 100000 -z 55 -e 1e-5 -d my_prot_blast_db -i goodProteins.fasta -o out.tab -m 8

(8) In my case, I created a directory named 'blast' inside the compliantFasta directory and copied cow.fasta, hsa.fasta and mus.fasta into it:

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ mkdir blast
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ cp cow.fasta blast
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ cp hsa.fasta blast
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ cp mus.fasta blast

and then parsed the BLAST output:

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclBlastParser out.tab blast/ >> similarSequences.txt

(9)

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclLoadBlast ../orthomcl.config.template similarSequences.txt


(10)

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclPairs ../orthomcl.config.template ../orthomcl_pairs.log cleanup=no

(11)

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclDumpPairsFiles ../orthomcl.config.template


(12)

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ mcl mclInput --abc -I 1.5 -o mclOutput

(13)

orthomclSoftware/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclMclToGroups cluster 1000 < mclOutput > groups.txt


This is what I did to get a successful run.

It is for those who, like me, love copy and paste.

If you found it useful, please vote, like or comment.

run easy orthomcl all-v-all mcl Tutorial • 15k views
0
Entering edit mode

Hi Durelle. I would like to run orthoMCL too, but I am quite confused about the analysis itself. I am still new to this, so some things are unclear to me. Below are my questions and the error I ran into when I tried running orthoMCL.

1.) DATA. I'm not really sure about the input data in the pre-processing stage (orthomclAdjustFasta). I've read in an article that the input file is a set of protein sequences in FASTA format. What I have is my de novo assembled transcriptome. Can I use that in orthoMCL? What I actually want to find are the orthologs. Or do I need to run TransDecoder, convert the .pep file into FASTA format, and use that as my input file?

2.) I tried using my de novo transcriptome in orthomclAdjustFasta, but I got an error while running the analysis. If I can actually use my transcriptomic data, it would be nice if you could shed some light on this error.

My assembled transcriptome contains >300,000 transcripts. A few of the first transcripts are:

TR1|c0_g2_i1 len=410 path=[443:0-227 459:228-257 457:258-287 448:288-311 449:312-409] [-1, 443, 459, 457, 448, 449, -2

TR2|c0_g1_i1 len=270 path=[292:0-92 293:93-112 294:113-269] [-1, 292, 293, 294, -2]

TR3|c0_g2_i1 len=386 path=[436:0-290 445:291-307 443:308-308 441:309-347 427:348-385] [-1, 436, 445, 443, 441, 427, -2]

TR3|c1_g1_i1 len=254 path=[232:0-253] [-1, 232, -2]

When I ran orthomclAdjustFasta I got an error stating that TR3 is repeated (or something like that; see the example above), which terminated the run. I am running it over ssh, and that's what I got in my .err output.

If I can use my data, what should I do to solve this error? I am thinking of renaming the repeated transcripts, but I am afraid that would jeopardize the result. Would that interfere with the BLAST run? What tool can you recommend to rename those repeated transcripts quickly? Renaming them individually is very tedious.

0
Entering edit mode

Hello stephravelo7!

I don't know if you have already fixed your problem. You must filter your .pep (protein) file and keep only unique (e.g. longest) isoforms, in FASTA format. The file extension doesn't matter as long as the file is correctly formatted. That is:

>c0_g2_i1
AAABBBCCC
>c0_g1_i1
AAABBBCCC
>c1_g1_i1
AAABBBCCC
...


That's what you should use as input for orthomclAdjustFasta.
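Before running orthomclAdjustFasta it can help to list the IDs that would collide. This is only a minimal sketch: it assumes that the definition line is split into fields on spaces and '|' and that the chosen field becomes the protein ID (that is my reading of the User's Guide, so please check it there). The function name list_duplicate_ids is made up for illustration.

```shell
# list_duplicate_ids FASTA FIELD
# Print every ID that occurs more than once when FIELD (1-based) of each
# definition line is taken as the protein ID.
# Assumption: fields are separated by spaces or '|'.
list_duplicate_ids() {
    grep '^>' "$1" |
        sed 's/^> *//' |
        awk -F'[ |]+' -v f="$2" '{ print $f }' |
        sort | uniq -d
}
```

For example, list_duplicate_ids transcripts.fasta 1 (transcripts.fasta being a placeholder name) prints nothing when every ID in field 1 is unique. With headers like the ones quoted above it would report TR3.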

Hope it can help :-)

0
Entering edit mode

Hi Santiago,

Thanks for the answer. Anyway, I ran into this error in orthomclDumpPairsFiles:

DBD::mysql::st execute failed: Error writing file '/tmp/MYXqNMk7' (Errcode: 28 - No space left on device) at /home/.../orthomclDumpPairsFiles line 54, <F> line 14.

Do you have any suggestion on how to solve this? Thanks.

0
Entering edit mode

This tutorial was really useful for me! Thanks!

0
Entering edit mode

Great. Thanks for posting your experience

0
Entering edit mode

Thank you very much! It was very useful and easy to follow

0
Entering edit mode
4.8 years ago
hudak7 • 0

When I run orthomclLoadBlast it says my SimilarSequences table is full. Any idea what this means? I have tons of space on my disk.

0
Entering edit mode

You can start a new question, these have a higher likelihood of getting answered. Sounds like you still have a previous run that finished orthomclLoadBlast in the MySQL database. If you're sure that's your data then you can skip that step. I usually end up DROPping the whole database before running all OrthoMCL steps, like this, where the mysql username is root:

mysql -u root -p
DROP DATABASE orthomcl;
CREATE DATABASE orthomcl;

Then running everything from the start.

0
Entering edit mode

Hi,

I am following the instructions and my analysis is running well until step 9, orthomclLoadBlast:

$ orthomclLoadBlast orthomcl.config similarSequences.txt

Error: DBD::mysql::st execute failed: Data too long for column 'SUBJECT_ID' at row 1567

0
Entering edit mode

The names of one or more genes are too long for the database!

Two ways to fix this:

• write a small script (using Biopython or BioPerl) that cuts your IDs down to below 60 characters

• in the OrthoMCL script orthomclInstallSchema, change these two lines:

QUERY_ID                 VARCHAR(60),
SUBJECT_ID               VARCHAR(60),


to something like

QUERY_ID                 VARCHAR(120),
SUBJECT_ID               VARCHAR(120),


and rerun the script (you will probably encounter other database columns which are too short)
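The first option does not strictly need Biopython or BioPerl. Here is a minimal awk sketch, under the assumption that the ID is the first word of each definition line; shorten_fasta_ids is a hypothetical helper name. Note that it drops any description after the ID, and that cutting IDs can create duplicates, so check for those afterwards.

```shell
# shorten_fasta_ids FASTA MAXLEN
# Write FASTA to stdout with each ID (first word of the definition line)
# cut to at most MAXLEN characters. Sequence lines pass through unchanged.
shorten_fasta_ids() {
    awk -v max="$2" '
        /^>/ { split(substr($0, 2), w, " "); print ">" substr(w[1], 1, max); next }
        { print }
    ' "$1"
}
```

Usage would be something like shorten_fasta_ids proteins.fasta 59 > proteins.short.fasta (both file names are placeholders), run on the input files before orthomclAdjustFasta so that the BLAST output and the database see the same IDs.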

0
Entering edit mode

After checking my similarSequences.txt, I think the problem is not the number of characters allowed in the InstallSchema script but the output generated by blastall with the command above. The database size should be specified with the -z option. In the command above, -z is 55 because that is the database size of the author's goodProteins.fasta. To find the database size of your own data, I use this command: grep ">" goodProteins.fasta | wc. The first number is the database size, which you will need when you run blastall.
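Following the suggestion above, the count can also be taken directly with grep -c, which prints only the number of matching lines. A small sketch (count_seqs is just an illustrative name):

```shell
# count_seqs FASTA
# Print the number of sequences, i.e. the number of lines starting with '>'.
count_seqs() {
    grep -c '^>' "$1"
}
```

The result can then be passed straight to the option, for example -z "$(count_seqs goodProteins.fasta)" in the blastall command from step 7.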

0
Entering edit mode

Hi Philipp,

Thanks for the help with my previous question. I have run into another error, this time in orthomclDumpPairsFiles.

DBD::mysql::st execute failed: Error writing file '/tmp/MYza1d1c' (Errcode: 28 - No space left on device) at /home/.../bin/orthomclDumpPairsFiles line 54, <F> line 14.


Do you have any idea how to solve this? Thanks!

0
Entering edit mode

Your hard-drive is full, delete unnecessary stuff :)

0
Entering edit mode

This is going to be a lame question, but how do I check the free space on my hard drive? Anyway, I can't delete files at the moment, so I am wondering if I can have the file written to another folder instead of /tmp/MYza1d1c. I have a folder with 613G, so I set its path as tmpdir in mysqld.cnf. However, when I restarted mysql for the change to take effect, I got an error:

Job for mysql.service failed because the control process exited with error code. See "systemctl status mysql.service" and "journalctl -xe" for details.


I am really new to this so I am sorry if I'm asking too many questions.

0
Entering edit mode

You can use

df -h


to see how much space you have left on each device.

There's a chance that the mysql process is not allowed to write to the folder with 613G, so look at the permissions of that folder using ls -lah
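Both checks can be combined for one specific folder. A minimal sketch, with check_tmpdir as a made-up helper name:

```shell
# check_tmpdir DIR
# Show free space on the filesystem that holds DIR, then the owner and
# permissions of DIR itself (ls -ld lists the directory, not its contents).
check_tmpdir() {
    df -h "$1" && ls -ld "$1"
}
```

Run it as check_tmpdir followed by the path of the 613G folder. If the MySQL server runs as a dedicated user (often called mysql, but that is an assumption about your system), that user needs write permission on the folder.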

0
Entering edit mode

I am having a problem at step 8. I am getting this message over and over again.

DBD::mysql::st execute failed: The used command is not allowed with this MySQL version at .......orthomclSoftware-v2.0.9/bin/orthomclLoadBlast line 39, <f> line 14.


My command was:

...bin/orthomclLoadBlast .....orthomclSoftware-v2.0.9/my_orthomcl_dir/orthomcl.config.template ....orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta/similarSequences


I have installed mysql-8.0.11-macos10.13-x86_64.dmg in MacOS High Sierra 10.13.6

I have tried to add:

dbConnectString=dbi:mysql:orthomcl:mysql_local_infile=1


to my orthomcl.config.template file. Still not working.

Any idea how to solve this?

Thanks.

0
Entering edit mode

Hi, it's kind of late, but to me this means there is an incompatibility between the MySQL and orthomcl versions you used. See this link (C: OrthoMCL installation on Ubuntu Linux) for an orthomcl installation that works with the commands above. Good luck!
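As far as I know, "The used command is not allowed with this MySQL version" is what MySQL reports when LOAD DATA LOCAL INFILE is switched off, and the "Loading local data is disabled" message reported further down the thread looks like the newer wording of the same problem. The sketch below is an assumption about the cause rather than a confirmed fix. It switches the server-side setting on and prints it back; enable_local_infile is a hypothetical helper, and the account used needs the privilege to set global variables.

```shell
# enable_local_infile USER
# Turn on the server-side local_infile setting and show its new value.
# The setting is lost when the server restarts unless it is also put in
# the server configuration file.
enable_local_infile() {
    mysql -u "$1" -p -e "SET GLOBAL local_infile = 1; SHOW GLOBAL VARIABLES LIKE 'local_infile';"
}
```

Call it as enable_local_infile root. The client side has to allow it too, which is what the mysql_local_infile=1 part of the dbConnectString tried above is meant to do.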

0
Entering edit mode

Hi Philipp,

Any idea why one would get this error message at step 4 (orthomclInstallSchema)?

orthomclInstallSchema my_orthomcl_dir/orthomcl.config my_orthomcl_dir/install_schema.log


Can't locate DBD/mysql.pm in @INC (you may need to install the DBD::mysql module)

here is my orthomcl.config file:

dbVendor=mysql
dbConnectString=dbi:mysql:mmax_orthomcl
similarSequencesTable=SimilarSequences
orthologTable=Ortholog
inParalogTable=InParalog
coOrthologTable=CoOrtholog
interTaxonMatchView=InterTaxonMatch
percentMatchCutoff=50
evalueExponentCutoff=-3
oracleIndexTblSpc=NONE


I tried manually installing DBD::mysql using the command below, and I got another error, about Devel::CheckLib:

[DBD-mysql-4.050]$ perl Makefile.PL
Can't locate Devel/CheckLib.pm in @INC (you may need to install the Devel::CheckLib module)

Do you have any idea how to solve this? Thank you.

1
Entering edit mode

That's a Perl error - you need to install Devel::CheckLib and DBD::mysql in your Perl installation, using either cpanm or cpan. Have a look at the OrthoMCL manual; the required Perl libraries should be listed there.

0
Entering edit mode

Thank you Philipp! May I ask two more questions? When I use makeblastdb in step 7, I don't get a resulting .fasta file as part of the output, only .psq, .phr and .pin files. Not sure if this is normal, as I see in the above steps they got a .fasta file. Also, how does one tell blastp to perform an 'all vs all' blastp analysis? "blastall" is not recognized for me. Thanks for any input!

2
Entering edit mode

That's the expected output of makeblastdb. It just makes the database for you. After that, you need to run blastp against it. blastall is part of the old BLAST version; the new one does not have the blastall command. I think it's okay to just run blastp. Well, yeah.

0
Entering edit mode

That makes sense. Thank you for the reply!

0
Entering edit mode
4.4 years ago
ashaneev07 ▴ 20

Question: orthomcl error: DBD::mysql::st execute failed: The table 'SimilarSequences' is full.....

Hi... When I use orthomclLoadBlast, I get an error message. Can anyone give me suggestions? Thanks.

home@home-Lenovo-H30-50:~/orthomclSoftware-v2.0.9/myOrthoMCL/ComplaintFasta$ orthomclLoadBlast /etc/orthomcl.config similarSequences.txt
DBD::mysql::st execute failed: The table 'SimilarSequences' is full at /usr/local/bin/orthomclLoadBlast line 39, <F> line 14.

0
Entering edit mode

Is it the first time you (try to) run this? If not, you might first need to empty the existing table (as the procedure will fail if the table exists and is not empty).

0
Entering edit mode

No. This is the second run. So I have to remove the orthomcl database, right?

0
Entering edit mode

Not the complete database, just empty it (or even just that specific table)
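Emptying just that table could look like the sketch below. The database name, table name and MySQL user are whatever your own setup and config file use (the table is the one named by similarSequencesTable), and empty_similar_sequences is a made-up helper name. TRUNCATE TABLE deletes all rows but keeps the table definition.

```shell
# empty_similar_sequences DB TABLE USER
# Remove all rows from TABLE in database DB, keeping the table itself.
# mysql prompts for the password because of -p.
empty_similar_sequences() {
    mysql -u "$3" -p "$1" -e "TRUNCATE TABLE $2;"
}
```

For the setup in the question this would be empty_similar_sequences orthomcl SimilarSequences root, assuming the database is called orthomcl and root is the MySQL user.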

0
Entering edit mode

No. Do not delete the orthomcl database; you just need to flush the existing table from your first run and then reconnect to mysql. The instructions should be in the orthomcl manual.

0
Entering edit mode

You can of course drop the whole database, but then you will need to re-create it (and thus go one step back in the whole orthomcl process). As stated before, there is normally no need to do this.

0
Entering edit mode

Thanks for the reply.

I dropped the orthomcl database and recreated it. Then it ran smoothly. But errors show up again at the next step.

home@home-Lenovo-H30-50:~/orthomclSoftware-v2.0.9/myOrthoMCL/ComplaintFasta$ orthomclPairs /etc/orthomcl.config orthomcl_pairs.log cleanup=no
DBD::mysql::st execute failed: The total number of locks exceeds the lock table size at /usr/local/bin/orthomclPairs line 709, <F> line 14.

0
Entering edit mode
4.1 years ago
lay_0 ▴ 50

Hi, thanks for posting the streamlined help. I was able to run everything up to step 8, and then, after submitting my command for step 9, I got the following error:

DBD::mysql::st execute failed: Row 1 doesn't contain data for all columns at /usr/local/bin/orthomclLoadBlast line 39, <F> line 13.

I am not very familiar with SQL; can someone please help me figure out what is wrong? My command line was:

/usr/local/bin/orthomclLoadBlast /usr/local/archive/orthomcl/orthomclSoftware-v2.0.9/bin/orthomclInstallSchema.config similarSequences.txt

And my config file looks like:

dbVendor=mysql
dbConnectString=dbi:mysql:orthomcl:localhost:3307
dbLogin=orthomcl
dbPassword=orthomcl
similarSequencesTable=SimilarSequences
orthologTable=Ortholog
inParalogTable=InParalog
coOrthologTable=CoOrtholog
interTaxonMatchView=InterTaxonMatch
percentMatchCutoff=0
evalueExponentCutoff=0
oracleIndexTblSpc=what
blastResultsTable=BlastResults

0
Entering edit mode

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

0
Entering edit mode

Can you post the head of your similarSequences.txt? It seems that you are missing some fields in that file. Perhaps also add the blast command you executed in one of the previous steps.

0
Entering edit mode

Thanks a lot! I could spot the error by looking at the similarSequences.txt file: I had a wrong header (several lines of "processing XX genome" just before the columns start). That was because I used nohup to redirect output.

Now the head of my SimilarSequences table looks like:

Atra|XP_013197910.1 Atra|XP_013197910.1 Atra Atra 2 -77 100 100
Atra|XP_013197910.1 Prap|XP_022124259.1 Atra Prap 6 -56 100 68.5
Atra|XP_013197910.1 Dple|OWR42513.1 Atra Dple 6 -56 100 68.5
Atra|XP_013197910.1 Prap|XP_022124260.1 Atra Prap 8 -56 100 68.5
Atra|XP_013197910.1 Prap|XP_022124258.1 Atra Prap 8 -56 100 68.5
Atra|XP_013197910.1 Prap|XP_022124256.1 Atra Prap 8 -56 100 68.5
Atra|XP_013197910.1 Bmor|XP_021203058.1 Atra Bmor 8 -56 100 68.5
Atra|XP_013197910.1 Prap|XP_022124257.1 Atra Prap 8 -56 100 68.5
Atra|XP_013197910.1 Bmor|XP_021203057.1 Atra Bmor 1 -55 100 68.5
Atra|XP_013197910.1 Vtam|XP_026498474.1 Atra Vtam 1 -55 98.9 68.5

Everything worked fine now, but at the end only 2 of the 6 genomes I added are in the final tables (only Atra and Prap). My config file looks like this now; could that be the issue?

dbVendor=mysql
dbConnectString=dbi:mysql:orthomcl:localhost:3307
dbLogin=orthomcl
dbPassword=orthomcl
similarSequencesTable=SimilarSequences
orthologTable=Ortholog
inParalogTable=InParalog
coOrthologTable=CoOrtholog
interTaxonMatchView=InterTaxonMatch
percentMatchCutoff=50
evalueExponentCutoff=-5
oracleIndexTblSpc=NONE
blastResultsTable=BlastResults
/usr/local/bin/orthomclInstallSchema.config (END)

Many thanks again!

0
Entering edit mode
2.6 years ago
krishdb38 • 0

I am also stuck at step 8, with a MySQL problem:

DBD::mysql::st execute failed: Loading local data is disabled; this must be enabled on both the client and server sides at ../orthomclLoadBlast line 39, <f> line 9.

0
Entering edit mode

This sounds like a 'configuration' issue on your mysql server. Most likely you will need to talk to your local sys-admin to open this up for you (== such that you are able & allowed to submit data to the DBs)

0
Entering edit mode

Could you please let me know if you have fixed this problem? I have the same issue here. Thank you!

0
Entering edit mode
2.6 years ago
hypeanut • 0

Hi, I am at step 7:

$ blastall -p blastp -F 'm S' -v 100000 -b 100000 -z 55 -e 1e-5 -d my_prot_blast_db -i goodProteins.fasta -o out.tab -m 8

After I ran this, the terminal reported:

Warning: [blastp] The parameter -num_descriptions is ignored for output formats > 4 . Use -max_target_seqs to control output

Could someone please tell me if I should change anything in the command, or just run it without changing anything? Thank you!

0
Entering edit mode

This is due to changes in newer versions of BLAST. In theory you can ignore this (blast will automagically change it for you), but better practice is to change the command line to use the correct parameter.

so change the -v 100000 -b 100000 part to -max_target_seqs 100000
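For anyone who only has BLAST+ (no blastall), the whole step 7 command can be written for blastp directly. This is my best mapping of the legacy options to their BLAST+ names, not something taken from the OrthoMCL guide, so compare it against blastp -help before relying on it. all_v_all_blastp is a hypothetical wrapper name.

```shell
# all_v_all_blastp QUERY_FASTA DB OUT DBSIZE
# BLAST+ version of the step 7 blastall call:
#   -m 8 -> -outfmt 6                   -e -> -evalue
#   -z   -> -dbsize                     -v/-b -> -max_target_seqs
#   -F 'm S' -> -seg yes -soft_masking true
all_v_all_blastp() {
    blastp -query "$1" -db "$2" -out "$3" \
        -outfmt 6 -evalue 1e-5 -dbsize "$4" \
        -seg yes -soft_masking true \
        -max_target_seqs 100000
}
```

With the file names from the tutorial this would be called as all_v_all_blastp goodProteins.fasta my_prot_blast_db out.tab 55.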

0
Entering edit mode
2.2 years ago
Buffo ★ 2.0k

Hi, I have installed orthoMCL using conda. Does anybody know how to set/get the login and password (of mysql)?