Tutorial: An easy way to run OrthoMCL (copy & paste)
3.2 years ago by
Esaie110
Canada
Esaie110 wrote:

After spending a lot of time on the 13 steps of OrthoMCL, I finally got it to run successfully. The difficulty for me was not OrthoMCL itself but knowing how to use each command, because the documentation is not clear enough. Here I present the commands and parameters I used, following the OrthoMCL User's Guide: http://orthomcl.org/common/downloads/software/v2.0/UserGuide.txt

(1) I am using MySQL. You first need to create a database named orthomcl. This is my orthomcl.config.template file:

 # this config assumes a mysql database named 'orthomcl'.  adjust according
 #to your situation. 

dbVendor=mysql 
dbConnectString=dbi:mysql:orthomcl 
dbLogin=your_login
dbPassword=your_password
similarSequencesTable=SimilarSequencesorthomcl
orthologTable=Orthologorthomcl
inParalogTable=InParalogorthomcl
coOrthologTable=CoOrthologorthomcl
interTaxonMatchView=InterTaxonMatchorthomcl
percentMatchCutoff=50
evalueExponentCutoff=-5
oracleIndexTblSpc=NONE
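
A quick sanity check of the config file can save a failed run later. The following is only a minimal sketch: the list of required keys is simply copied from the config shown above (it is an assumption, not taken from the OrthoMCL source), and `parse_config` and `missing_keys` are hypothetical helper names.

```python
# Keys copied from the config shown in this post (an assumption, not an official list).
REQUIRED_KEYS = [
    "dbVendor", "dbConnectString", "dbLogin", "dbPassword",
    "similarSequencesTable", "orthologTable", "inParalogTable",
    "coOrthologTable", "interTaxonMatchView", "percentMatchCutoff",
    "evalueExponentCutoff", "oracleIndexTblSpc",
]


def parse_config(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_keys(settings):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not settings.get(key)]
```

To use it, read your config file into a string, call `parse_config` on it, and print the result of `missing_keys`; an empty list means every key from the example above is present.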

(2) and (3) are obvious.

(4) On the command line, change to the root of the OrthoMCL directory and run this command. It will create the tables in your orthomcl database:

orthomclSoftware-v2.0.9$ bin/orthomclInstallSchema my_orthomcl_dir/orthomcl.config.template my_orthomcl_dir/install_schema.log

(5) According to the OrthoMCL guide, change to the compliantFasta directory:

orthomclSoftware-v2.0.9$ cd my_orthomcl_dir/compliantFasta/

and then run orthomclAdjustFasta once per input file, for example cow.fasta:

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclAdjustFasta cow /home/....../cow.fasta 1
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclAdjustFasta hsa /home/....../human.fasta 1
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclAdjustFasta mus /home/....../mouse.fasta 1

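These commands will create 3 files in the compliantFasta directory, named cow.fasta, hsa.fasta and mus.fasta.

(6) Filter the proteins (minimum length 10, maximum 20% stop codons). This writes goodProteins.fasta and poorProteins.fasta: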

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclFilterFasta . 10 20

(7) Step 7: All-v-all BLAST

You first need to create a local BLAST database from goodProteins.fasta:

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$  makeblastdb -in goodProteins.fasta -dbtype prot -out my_prot_blast_db

and then run the all-vs-all BLAST:

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ blastall -p blastp -F 'm S' -v 100000 -b 100000 -z 55 -e 1e-5 -d my_prot_blast_db -i goodProteins.fasta -o out.tab -m 8

(8) In my case, I created a directory named 'blast' inside the compliantFasta directory and copied cow.fasta, hsa.fasta and mus.fasta into it:

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ mkdir blast

and

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ cp cow.fasta blast
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ cp hsa.fasta blast
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ cp mus.fasta blast

and

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclBlastParser out.tab blast/ >> similarSequences.txt

(9)

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclLoadBlast ../orthomcl.config.template similarSequences.txt

(10)

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclPairs ../orthomcl.config.template ../orthomcl_pairs.log cleanup=no

(11) Dump the pairs. This creates the mclInput file and a pairs directory in the current directory:

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclDumpPairsFiles ../orthomcl.config.template

(12)

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ mcl mclInput --abc -I 1.5 -o mclOutput

(13)

orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclMclToGroups cluster 1000 < mclOutput > groups.txt
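
Once groups.txt exists, a few lines of Python can summarise it. This is a sketch under the assumption that each line looks like `cluster1000: cow|id1 hsa|id2 ...` (a group name, a colon, then space-separated `taxon|id` members); `parse_group_line` and `taxa_in_group` are hypothetical helper names.

```python
from collections import Counter


def parse_group_line(line):
    """Split 'cluster1000: cow|A hsa|B' into ('cluster1000', ['cow|A', 'hsa|B'])."""
    group_id, _, members = line.partition(":")
    return group_id.strip(), members.split()


def taxa_in_group(members):
    """Count how many members each taxon code contributes to one group."""
    return Counter(member.split("|", 1)[0] for member in members)
```

Looping over the lines of groups.txt and calling `taxa_in_group` on each member list shows, for instance, which groups contain exactly one protein from every species.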

This is what I did to get a successful run.

I hope it helps you.

This is for those who, like me, love copy and paste.

If you like it, please vote, like or comment.

ADD COMMENTlink modified 8 months ago by RamRS24k • written 3.2 years ago by Esaie110

Hi Durelle. I would like to run OrthoMCL too, but I am quite confused by the analysis itself. I am still new to this, so there are some things I am a bit confused about. Below are my questions and the error I ran into when I tried running OrthoMCL.

1.) DATA. I'm not really sure about the input data for the pre-processing stage (orthomclAdjustFasta). I've read in an article that the input file is protein sequences in FASTA format. What I have is my de novo assembled transcriptome. Can I use that in OrthoMCL? What I actually want to find are the orthologs. Or do I need to run TransDecoder, convert the .pep file into FASTA format, and use that as my input file?

2.) I tried using my de novo transcriptome in orthomclAdjustFasta but I got an error while running the analysis. If I can actually use my transcriptome data, it would be nice if you could shed some light on this error.

My assembled transcriptome contains >300,000 transcripts. A few of the first transcripts are:

TR1|c0_g2_i1 len=410 path=[443:0-227 459:228-257 457:258-287 448:288-311 449:312-409] [-1, 443, 459, 457, 448, 449, -2]

TR2|c0_g1_i1 len=270 path=[292:0-92 293:93-112 294:113-269] [-1, 292, 293, 294, -2]

TR3|c0_g2_i1 len=386 path=[436:0-290 445:291-307 443:308-308 441:309-347 427:348-385] [-1, 436, 445, 443, 441, 427, -2]

TR3|c1_g1_i1 len=254 path=[232:0-253] [-1, 232, -2]

When I ran orthomclAdjustFasta I got an error stating that TR3 is repeated (or something like that; see the example above), which terminated the run. I am running over ssh and that's what I got in my .err output.

If I can use my data, what should I do to solve this error? I am thinking of renaming the repeated transcripts, but I am afraid that would jeopardize the result. Would that interfere with the BLAST run? What tool can you recommend for editing those repeated transcripts quickly? Renaming them individually is very tedious.

ADD REPLYlink modified 3.1 years ago • written 3.1 years ago by bioinfool0

Hello stephravelo7!

I don't know if you have already fixed your problem. You must filter your .pep (protein) file and keep only unique (e.g. longest) isoforms, in FASTA format. The file extension doesn't matter as long as the file is correctly formatted. That is:

>c0_g2_i1
AAABBBCCC
>c0_g1_i1
AAABBBCCC
>c1_g1_i1
AAABBBCCC
...

That's what you should use as input for orthomclAdjustFasta.
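
On the "TR3 is repeated" message itself: one plausible reading, assuming (as the User's Guide describes) that orthomclAdjustFasta splits the definition line on spaces and '|', is that field 1 of both `TR3|c0_g2_i1` and `TR3|c1_g1_i1` is just `TR3`. A rough sketch for rewriting headers so that the first field is unique follows; `sanitize_header` and `sanitize_fasta` are hypothetical names, and the input should still be protein sequences.

```python
def sanitize_header(header):
    """Keep only the first token of a FASTA header and replace '|' with '_'."""
    tokens = header[1:].split()
    if not tokens:
        raise ValueError("empty FASTA header")
    return ">" + tokens[0].replace("|", "_")


def sanitize_fasta(lines):
    """Rewrite every header line; stop if two records end up with the same ID."""
    seen = set()
    cleaned = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">"):
            line = sanitize_header(line)
            if line in seen:
                raise ValueError(f"duplicate ID after renaming: {line}")
            seen.add(line)
        cleaned.append(line)
    return cleaned
```

Renaming this way does not touch the sequences, so it should not change the BLAST hits, only the labels attached to them.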

Hope it can help :-)

ADD REPLYlink modified 3.1 years ago • written 3.1 years ago by Santiago Montero-Mendieta120

Hi Santiago,

Thanks for the answer. I have now run into this error in orthomclDumpPairsFiles:

DBD::mysql::st execute failed: Error writing file '/tmp/MYXqNMk7' (Errcode: 28 - No space left on device) at /home/.../orthomclDumpPairsFiles line 54, <F> line 14.

Do you have any suggestion on how to solve this? Thanks.

ADD REPLYlink modified 8 months ago by RamRS24k • written 15 months ago by bioinfool20

This tutorial was really useful for me! Thanks!

ADD REPLYlink written 3.1 years ago by Santiago Montero-Mendieta120

Great. Thanks for posting your experience

ADD REPLYlink written 3.1 years ago by seta1.2k
17 months ago by
hudak70
hudak70 wrote:

When I run orthomclLoadBlast it says my SimilarSequences table is full. Any idea what this means? I have plenty of free disk space.

ADD COMMENTlink written 17 months ago by hudak70

You can start a new question; those have a higher likelihood of getting answered.

Sounds like you still have a previous run that finished orthomclLoadBlast in the MySQL database. If you're sure that's your data then you can skip that step. I usually end up DROPping the whole database before running all OrthoMCL steps, like this, where the MySQL username is root:

mysql -u root -p

DROP DATABASE orthomcl;

CREATE DATABASE orthomcl;

Then run everything from the start.

ADD REPLYlink written 17 months ago by Philipp Bayer6.4k

Hi,

I am following the instructions and my analysis ran well until step 9.

orthomclLoadBlast

$ orthomclLoadBlast orthomcl.config similarSequences.txt

Error: DBD::mysql::st execute failed: Data too long for column 'SUBJECT_ID' at row 1567
ADD REPLYlink modified 8 months ago by RamRS24k • written 16 months ago by bioinfool20

The names of one or more genes are too long for the database!

Two ways to fix this:

  • write a small script (using Biopython or BioPerl) that truncates your IDs to below 60 characters

  • in the OrthoMCL script orthomclInstallSchema, change these two lines:

    QUERY_ID                 VARCHAR(60),
    SUBJECT_ID               VARCHAR(60),
    

to something like

 QUERY_ID                 VARCHAR(120),
 SUBJECT_ID               VARCHAR(120),

and rerun the script (you will probably encounter other database columns which are too short)
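
For the first option, plain Python is enough. The sketch below assumes the stored ID is `taxon|id` (as in the similarSequences rows shown elsewhere in this thread), so the taxon code and the '|' count against the 60-character column. `shorten_ids` is a hypothetical helper that returns an old-to-new mapping and refuses to continue if truncation would make two IDs identical.

```python
def shorten_ids(ids, taxon_code, column_width=60):
    """Map each original ID to a truncated one that fits 'taxon|id' in the column."""
    budget = column_width - len(taxon_code) - 1  # room left after 'taxon|'
    if budget < 1:
        raise ValueError("taxon code leaves no room for an ID")
    mapping = {}
    used = set()
    for original in ids:
        short = original[:budget]
        if short in used:
            raise ValueError(f"truncating {original!r} collides with another ID")
        used.add(short)
        mapping[original] = short
    return mapping
```

Apply the mapping to the FASTA headers before running orthomclAdjustFasta, and keep the mapping so you can translate the group members back to the original names afterwards.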

ADD REPLYlink modified 15 months ago • written 15 months ago by Philipp Bayer6.4k

After checking my similarSequences.txt, I think the problem is not the number of characters in the InstallSchema script but the output generated by blastall with the command above. The database size should be specified with the -z option. In the command above, -z is 55 because that is the database size of the author's goodProteins.fasta. To find the database size of your own data, I use this command: grep ">" goodProteins.fasta | wc. The first number gives the database size, which you need when running blastall.
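
The same count can be taken in Python. This is a small sketch that follows the approach in this post (using the number of sequences as the -z value); `count_fasta_records` is a hypothetical name and it simply counts header lines.

```python
def count_fasta_records(lines):
    """Count FASTA records by counting lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))
```

For example, `with open("goodProteins.fasta") as handle: print(count_fasta_records(handle))` prints the number to use.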

ADD REPLYlink written 15 months ago by bioinfool20

Hi Philipp,

Thanks for the help with my previous question. I have run into another error, this time in orthomclDumpPairsFiles.

DBD::mysql::st execute failed: Error writing file '/tmp/MYza1d1c' (Errcode: 28 - No space left on device) at /home/.../bin/orthomclDumpPairsFiles line 54, <F> line 14.

Do you have any idea how to solve this? Thanks!

ADD REPLYlink modified 8 months ago by RamRS24k • written 15 months ago by bioinfool20

Your hard-drive is full, delete unnecessary stuff :)

ADD REPLYlink written 15 months ago by Philipp Bayer6.4k

This is going to be a lame question, but how do I check the free space on my hard drive? Anyway, I can't delete files at the moment, so I am wondering if I can have the file written to another folder instead of /tmp/MYza1d1c. I have a folder with 613G, so I set its path as tmpdir in mysqld.cnf. But when I restarted mysql for the change to take effect, I got an error:

Job for mysql.service failed because the control process exited with error code. See "systemctl status mysql.service" and "journalctl -xe" for details.

I am really new to this so I am sorry if I'm asking too many questions.

ADD REPLYlink modified 8 months ago by RamRS24k • written 15 months ago by bioinfool20

You can use

df -h

to see how much space you have left on each device.

There's a chance that the mysql process is not allowed to write to the folder with 613G, so check the permissions of that folder using ls -lah
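
If you prefer to check from a script, the standard library can report free space per path. A minimal sketch, with `free_gib` and `report` as hypothetical helper names; pass it /tmp and the folder you want MySQL to use instead.

```python
import shutil
import tempfile


def free_gib(path):
    """Return the free space, in GiB, on the filesystem that holds 'path'."""
    return shutil.disk_usage(path).free / 1024 ** 3


def report(paths):
    """Build one human-readable line per path."""
    return [f"{path}: {free_gib(path):.1f} GiB free" for path in paths]
```

Calling `print("\n".join(report([tempfile.gettempdir()])))` shows how much room the default temporary directory has left.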

ADD REPLYlink written 15 months ago by Philipp Bayer6.4k

I am having a problem at step 8. I am getting this message over and over again.

DBD::mysql::st execute failed: The used command is not allowed with this MySQL version at .......orthomclSoftware-v2.0.9/bin/orthomclLoadBlast line 39, <f> line 14.

My command was:

...bin/orthomclLoadBlast .....orthomclSoftware-v2.0.9/my_orthomcl_dir/orthomcl.config.template ....orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta/similarSequences

I have installed mysql-8.0.11-macos10.13-x86_64.dmg in MacOS High Sierra 10.13.6

I have tried to add:

dbConnectString=dbi:mysql:orthomcl:mysql_local_infile=1

to my orthomcl.config.template file. Still not working.

Any idea how to solve this?

Thanks.

ADD REPLYlink modified 8 months ago by RamRS24k • written 14 months ago by Seq22590

Hi, it's kind of late, but it looks to me like there's an incompatibility between the MySQL and OrthoMCL versions you used. See this link (C: OrthoMCL installation on Ubuntu Linux) for an OrthoMCL installation that works with the commands above. Good luck!

ADD REPLYlink written 14 months ago by bioinfool20

Hi Philipp,

Any idea why one would get this error message at step 4 (orthomclInstallSchema)?

orthomclInstallSchema my_orthomcl_dir/orthomcl.config my_orthomcl_dir/install_schema.log

Can't locate DBD/mysql.pm in @INC (you may need to install the DBD::mysql module)

here is my orthomcl.config file:

dbVendor=mysql 
dbConnectString=dbi:mysql:mmax_orthomcl
dbLogin=mmax
dbPassword=PASSWORD
similarSequencesTable=SimilarSequences
orthologTable=Ortholog
inParalogTable=InParalog
coOrthologTable=CoOrtholog
interTaxonMatchView=InterTaxonMatch
percentMatchCutoff=50
evalueExponentCutoff=-3
oracleIndexTblSpc=NONE

I tried manually installing DBD::mysql using the commands below, and I get another error about Devel::CheckLib:

[DBD-mysql-4.050]$ perl Makefile.PL

Can't locate Devel/CheckLib.pm in @INC (you may need to install the Devel::CheckLib module)

Do you have any idea how to solve this? Thank you.

ADD REPLYlink written 6 months ago by max_19110

That's a Perl error: you need to install Devel::CheckLib and DBD::mysql in your Perl installation, using either cpanm or cpan. Have a look at the OrthoMCL manual; the required Perl libraries should be listed there.

ADD REPLYlink written 6 months ago by Philipp Bayer6.4k

Thank you Philipp! May I ask two more questions? When I use makeblastdb in step 7, I don't get a .fasta file as part of the output, only .psq, .phr and .pin files. I'm not sure if this is normal, as in the steps above it looks like they got a .fasta file.

Also, how does one tell blastp to perform an all-vs-all analysis? "blastall" is not recognized for me.

Thanks for any input!

ADD REPLYlink written 6 months ago by max_19110

That's the expected output of makeblastdb. It just builds the database for you. After that, you need to run blastp against it. blastall belongs to the old (legacy) BLAST version; the new one (BLAST+) does not have a blastall command. I think it's okay to just run blastp.
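
For reference, here is how the blastall line from the tutorial might translate to BLAST+ blastp. The option mapping is my own assumption, not something from the OrthoMCL guide: -m 8 becomes -outfmt 6, -e becomes -evalue, -F 'm S' becomes -seg yes with -soft_masking true, -z becomes -dbsize, and -v/-b become -max_target_seqs. `blastp_command` is a hypothetical helper that only builds the argument list.

```python
def blastp_command(query, db, out, dbsize, evalue="1e-5", max_hits=100000):
    """Build a blastp argument list roughly equivalent to the tutorial's blastall call."""
    return [
        "blastp",
        "-query", query,
        "-db", db,
        "-out", out,
        "-outfmt", "6",             # tabular output, like blastall -m 8
        "-evalue", evalue,
        "-seg", "yes",              # low-complexity filtering ...
        "-soft_masking", "true",    # ... applied as soft masking
        "-dbsize", str(dbsize),
        "-max_target_seqs", str(max_hits),
    ]
```

Running the query file against a database built from the same file is what makes the search all-vs-all. The list can be handed to `subprocess.run(cmd, check=True)`, or joined with spaces and pasted into a shell.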

ADD REPLYlink written 6 months ago by bioinfool20

That makes sense. Thank you for the reply!

ADD REPLYlink written 6 months ago by max_19110
12 months ago by
ashaneev0710
ashaneev0710 wrote:

Question: orthomcl error: DBD::mysql::st execute failed: The table 'SimilarSequences' is full.....

Hi... When I use orthomclLoadBlast, I get an error message. Can anyone give me suggestions? Thanks.

home@home-Lenovo-H30-50:~/orthomclSoftware-v2.0.9/myOrthoMCL/ComplaintFasta$ orthomclLoadBlast /etc/orthomcl.config similarSequences.txt
DBD::mysql::st execute failed: The table 'SimilarSequences' is full at /usr/local/bin/orthomclLoadBlast line 39, <F> line 14.
ADD COMMENTlink modified 8 months ago by RamRS24k • written 12 months ago by ashaneev0710

Is this the first time you (try to) run this? If not, you might first need to empty the existing table (the procedure will fail if the table exists and is not empty).

ADD REPLYlink written 12 months ago by lieven.sterck5.8k

No, this is the second run. So I have to remove the orthomcl database, right?

ADD REPLYlink written 12 months ago by ashaneev0710

Not the complete database, just empty it (or even just that specific table)

ADD REPLYlink written 12 months ago by lieven.sterck5.8k

No. Do not delete the orthomcl database; you just need to flush (empty) the existing table from your first run and then reconnect to MySQL. The instructions should be in the OrthoMCL manual.

ADD REPLYlink written 12 months ago by bioinfool20

You can of course drop the whole database, but then you will need to re-create it (and thus go one step back in the whole OrthoMCL process). As stated before, there is normally no need to do this.

ADD REPLYlink written 12 months ago by lieven.sterck5.8k

Thanks for the reply..

I dropped the orthomcl database and recreated it. Then it ran smoothly. But errors show up again at the next step.

home@home-Lenovo-H30-50:~/orthomclSoftware-v2.0.9/myOrthoMCL/ComplaintFasta$ orthomclPairs /etc/orthomcl.config orthomcl_pairs.log cleanup=no

DBD::mysql::st execute failed: The total number of locks exceeds the lock table size at /usr/local/bin/orthomclPairs line 709, <F> line 14.
ADD REPLYlink modified 8 months ago by RamRS24k • written 12 months ago by ashaneev0710
8 months ago by
lay_050
lay_050 wrote:

Hi, thanks for posting the streamlined help.

I was able to run everything up to step 8, and then, after submitting my command for step 9, I got the following error:

DBD::mysql::st execute failed: Row 1 doesn't contain data for all columns at /usr/local/bin/orthomclLoadBlast line 39, <F> line 13.

I am not very familiar with SQL; can someone please help me figure out what is wrong?

My command line was :

/usr/local/bin/orthomclLoadBlast /usr/local/archive/orthomcl/orthomclSoftware-v2.0.9/bin/orthomclInstallSchema.config similarSequences.txt

And my config scheme looks like:

dbVendor=mysql
dbConnectString=dbi:mysql:orthomcl:localhost:3307
dbLogin=orthomcl
dbPassword=orthomcl
similarSequencesTable=SimilarSequences
orthologTable=Ortholog
inParalogTable=InParalog
coOrthologTable=CoOrtholog
interTaxonMatchView=InterTaxonMatch
percentMatchCutoff=0
evalueExponentCutoff=0
oracleIndexTblSpc=what
blastResultsTable=BlastResults
ADD COMMENTlink modified 8 months ago by RamRS24k • written 8 months ago by lay_050

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

ADD REPLYlink written 8 months ago by RamRS24k

Can you post the head of your similarSequences.txt? It seems that you are missing some fields in that file. Perhaps also add the BLAST command you executed in one of the previous steps.
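
A quick way to look for such lines yourself is sketched below. It assumes the file is tab-separated with 8 fields per row (query, subject, query taxon, subject taxon, e-value mantissa, e-value exponent, percent identity, percent match); `bad_lines` is a hypothetical helper that returns every line that does not fit.

```python
def bad_lines(lines, expected_fields=8):
    """Return (line_number, text) for every line without the expected field count."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\n")
        if len(text.split("\t")) != expected_fields:
            problems.append((number, text))
    return problems
```

Anything it reports, such as stray log messages mixed into the file, would explain a "doesn't contain data for all columns" error at load time.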

ADD REPLYlink written 8 months ago by lieven.sterck5.8k

Thanks a lot! I could spot the error by looking at the similarSequences.txt file: I had a wrong header (several lines of "processing XX genome" just before the columns start). That was because I used nohup to redirect output. Now the head of my SimilarSequences table looks like:

Atra|XP_013197910.1     Atra|XP_013197910.1     Atra    Atra    2       -77     100     100
Atra|XP_013197910.1     Prap|XP_022124259.1     Atra    Prap    6       -56     100     68.5
Atra|XP_013197910.1     Dple|OWR42513.1 Atra    Dple    6       -56     100     68.5
Atra|XP_013197910.1     Prap|XP_022124260.1     Atra    Prap    8       -56     100     68.5
Atra|XP_013197910.1     Prap|XP_022124258.1     Atra    Prap    8       -56     100     68.5
Atra|XP_013197910.1     Prap|XP_022124256.1     Atra    Prap    8       -56     100     68.5
Atra|XP_013197910.1     Bmor|XP_021203058.1     Atra    Bmor    8       -56     100     68.5
Atra|XP_013197910.1     Prap|XP_022124257.1     Atra    Prap    8       -56     100     68.5
Atra|XP_013197910.1     Bmor|XP_021203057.1     Atra    Bmor    1       -55     100     68.5
Atra|XP_013197910.1     Vtam|XP_026498474.1     Atra    Vtam    1       -55     98.9    68.5

Everything works fine now, but in the end only 2 of the 6 genomes I added are in the final tables (only Atra and Prap). My config file looks like this now; could that be the issue?

dbVendor=mysql
dbConnectString=dbi:mysql:orthomcl:localhost:3307
dbLogin=orthomcl
dbPassword=orthomcl
similarSequencesTable=SimilarSequences
orthologTable=Ortholog
inParalogTable=InParalog
coOrthologTable=CoOrtholog
interTaxonMatchView=InterTaxonMatch
percentMatchCutoff=50
evalueExponentCutoff=-5
oracleIndexTblSpc=NONE
blastResultsTable=BlastResults
/usr/local/bin/orthomclInstallSchema.config (END)

Many thanks again!
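
One way to narrow this down is to tally which taxon pairs actually occur in similarSequences.txt before loading it. This is a diagnostic sketch only, again assuming tab-separated rows with the query taxon in column 3 and the subject taxon in column 4; `taxon_pair_counts` is a hypothetical name.

```python
from collections import Counter


def taxon_pair_counts(lines):
    """Count rows per (query taxon, subject taxon) pair."""
    pairs = Counter()
    for raw in lines:
        fields = raw.rstrip("\n").split("\t")
        if len(fields) >= 4:
            pairs[(fields[2], fields[3])] += 1
    return pairs
```

If some of the six genomes never appear as a query or subject taxon here, the problem lies upstream of the database steps (in the BLAST or parsing step) rather than in the config file.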

ADD REPLYlink modified 8 months ago • written 8 months ago by lay_050