Question: Error running certain PAML - CodeML sequences
gyler75 wrote (4.4 years ago):

Hey Biostars,

Recently I have been using PAML to infer Ka/Ks values from different homologous sequences. The problem arose when I tried to run the program with 10 different sequences. After some troubleshooting, I found that 3 of the 10 sequences (Bat, Eleshrew and Pig) cause the program to be killed or to run incorrectly.

The protein / nucleotide sequences were all downloaded from NCBI and then aligned using Pal2Nal and then MUSCLE.

Does anyone know what could be causing the runtime error and, if so, how to fix it? The tests I am using are the M0 and the branch tests. I am running with the following control file.

https://drive.google.com/open?id=0B_Dt5BQxzmAJWWdjQmVnbExvaDg&authuser=0

Thank you for your time and answers.

My unrooted tree file is:
((Human,Chimp),Shrew,Bat,Flying,Eleshrew,Rabbit,Mouse,Dog,Pig);
https://drive.google.com/open?id=0B_Dt5BQxzmAJeFhwTTRCQy1weWs&authuser=0

Lastly, these are the input homologous protein sequences.
https://drive.google.com/open?id=0B_Dt5BQxzmAJWk1ERGc0YXRvdWs&authuser=0

Thank you to whoever can help me get over this. It is very much appreciated!

Feel free to ask any questions, I will answer to the best of my abilities.

Ricardo

modified 4.4 years ago by Brice Sarver • written 4.4 years ago by gyler75

1. The files you linked are blocked.

2. Paste the error message you are getting.

 

written 4.4 years ago by h.mon

Fixed, sorry about that. As for the error, it's just the generic:

C:\Users\paml4.7\bin\codeml.exe -- killed

written 4.4 years ago by gyler75

Are there really commas in the folder and file names, as shown in your control file?

written 4.4 years ago by h.mon

There are commas in some of the folder names, but not in the file names. Could that be causing the issue?

written 4.4 years ago by gyler75

Three suggestions:

1. As a general rule, do not use spaces or other special characters in folder and file names, particularly for command-line applications. Other characters are allowed, but if you don't know which ones are and which aren't (I don't), play it safe and use only letters and underscores.

2. Your PHYLIP and NEXUS files seem odd; you should check them. Also, the Eleshrew sequence does not seem to be homologous to the others.

3. Carefully check the "Windows annoyances" section of the PAML manual.

modified 4.4 years ago • written 4.4 years ago by h.mon
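
The advice about folder and file names can be checked mechanically before launching codeml. The following is a minimal sketch, not part of PAML: the function name `unsafe_parts` and the whitelist are assumptions made for illustration. The whitelist is slightly wider than "letters and underscore" because it also admits digits and dots, so that file extensions and a folder such as `paml4.7` pass.

```python
import re
from pathlib import PureWindowsPath

# Assumed whitelist: letters, digits, underscore and dot.
# Spaces, commas and everything else are reported as unsafe.
SAFE_PART = re.compile(r"^[A-Za-z0-9_.]+$")


def unsafe_parts(path):
    """Return the components of a Windows path that fall outside the whitelist."""
    win_path = PureWindowsPath(path)
    bad = []
    for part in win_path.parts:
        # Skip the drive/root component such as 'C:\'
        if part == win_path.anchor:
            continue
        if not SAFE_PART.match(part):
            bad.append(part)
    return bad


if __name__ == "__main__":
    # Hypothetical path, used only to show the report format
    print(unsafe_parts(r"C:\Users\my data, v2\codeml.ctl"))
```

Any path written in the control file (sequence file, tree file, output file) can be passed through the same function; an empty list means every component stays within the whitelist.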

I redid all of the sequences, and now some of them run with PAML (in FASTA format) while others either stall or are killed. For example, my aars runs but aars2 is killed; mars runs, but mars2 stalls.

https://drive.google.com/folderview?id=0B_Dt5BQxzmAJflF1Q082RHBnMGhfbWF5cnEzMFJ5UVZ2amVZZk04eWJ5aXFBZ1hHczgyVDQ&usp=sharing

I will try playing with the formatting to see if that changes anything. Thank you for your help thus far!

 

 

 

written 4.4 years ago by gyler75
Brice Sarver (United States) wrote (4.4 years ago):

1. Your path to your output file has slashes in the wrong direction.

2. PAML takes sequence data files in Phylip or NEXUS format. Yours is FASTA.

The PAML manual.

written 4.4 years ago by Brice Sarver
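
For the format point, a conversion can be done without an external web tool. The sketch below is one possible approach in plain Python, with hypothetical helper names `read_fasta` and `to_phylip`. It writes sequential PHYLIP with each name separated from its sequence by two spaces, which is the layout I understand PAML to expect; the PAML manual remains the reference for the exact rules.

```python
def read_fasta(text):
    """Parse FASTA text into an ordered list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Keep only the first word of the header line as the name
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def to_phylip(records):
    """Format aligned (name, sequence) pairs as sequential PHYLIP text."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("no sequences, or sequences differ in length")
    (length,) = lengths
    # Header line: number of sequences, then alignment length
    lines = [f" {len(records)} {length}"]
    for name, seq in records:
        # Two spaces between the name and the sequence
        lines.append(f"{name}  {seq}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy alignment, used only to show the output layout
    example = ">Human\nATGAAA\n>Chimp\nATGAAG\n"
    print(to_phylip(read_fasta(example)), end="")
```

The length check doubles as a quick test that the input really is an alignment: unaligned sequences of different lengths raise an error instead of producing a file that fails later.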

It's weird. What I don't understand is that when I run with these same settings (FASTA input, same output file) after removing the Bat, Pig and Eleshrew sequences from both the tree file and the sequence file, the program runs flawlessly. I did try converting to NEXUS and PHYLIP format, but kept running into the same error. I used http://www.ebi.ac.uk/Tools/sfc/readseq/ for the conversion. Thank you so much for your help thus far, Brice!

written 4.4 years ago by gyler75
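
Because sequences are being added to and removed from both the tree file and the sequence file, it may be worth confirming that the two still list exactly the same names, including case. This is a rough sketch with hypothetical function names; the regular expression assumes a simple Newick string like the one posted in the question and is not a full Newick parser.

```python
import re


def newick_leaves(newick):
    """Extract tip names: tokens that directly follow '(' or ','."""
    return set(re.findall(r"[(,]\s*([^(),:;\s]+)", newick))


def compare_names(newick, alignment_names):
    """Return (names only in the tree, names only in the alignment)."""
    tree = newick_leaves(newick)
    aln = set(alignment_names)
    return sorted(tree - aln), sorted(aln - tree)


if __name__ == "__main__":
    tree = "((Human,Chimp),Shrew,Bat,Flying,Eleshrew,Rabbit,Mouse,Dog,Pig);"
    # Hypothetical alignment names with one case mismatch
    names = ["Human", "Chimp", "Shrew", "Bat", "Flying",
             "Eleshrew", "Rabbit", "Mouse", "Dog", "pig"]
    print(compare_names(tree, names))
```

Two empty lists mean the tree and the alignment agree on names; anything else points at a spelling or case difference to fix before rerunning.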

That's really strange, though I've noticed some odd behavior from PAML before. Things will run, but you'll get segfaults if the names start with a number, for example, or if they are longer than 10 characters, even though relaxed PHYLIP format is supposed to be acceptable.

written 4.4 years ago by Brice Sarver
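
The two naming pitfalls described here (names starting with a number, names longer than 10 characters) are easy to screen for. A minimal sketch follows, with a hypothetical function name and the 10-character limit taken from the comment above rather than from the PAML documentation.

```python
def name_problems(names, max_len=10):
    """Map each problematic sequence name to a list of the issues found."""
    problems = {}
    for name in names:
        issues = []
        # Names beginning with a digit were reported to cause segfaults
        if name[:1].isdigit():
            issues.append("starts with a digit")
        # Length limit as reported in the thread, adjustable via max_len
        if len(name) > max_len:
            issues.append(f"longer than {max_len} characters")
        if issues:
            problems[name] = issues
    return problems


if __name__ == "__main__":
    # Hypothetical names, used only to show the report format
    print(name_problems(["Human", "2Bat", "Elephantshrew"]))
```

An empty dictionary means none of the names trip either rule.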

Yeah, I had read about the problem with names longer than 10 characters, so I gave the sequences short codenames to keep it simple. I find no errors in my NEXUS or PHYLIP files, but I am still unable to get the program running. Do you have any suggestions? Thanks once again for all your help, Brice! It really is hard to understand why those 3 sequences in particular are giving me such a hard time.

Nexus format: https://drive.google.com/open?id=0B_Dt5BQxzmAJWFJhMkU1aGw1OVE&authuser=0

Phylip: https://drive.google.com/open?id=0B_Dt5BQxzmAJaWxGMnJFQW85TWc&authuser=0

written 4.4 years ago by gyler75