How To Use Pal2Nal Commandline With Python
10.3 years ago
ishengomae ▴ 110

I recently learned that Pal2Nal is a good tool for aligning nucleotide sequences for downstream analyses of positive selection, but I don't know how to use it from the command line. The web server requires uploading a protein alignment and the corresponding unaligned nucleotide sequences. I have thousands of sequences, so the web-based tool is impractical for me. My boss has suggested I use the command-line version and write a Python script that does the alignment and "include some lines in the script to feed the resulting codon alignment to "codeml" program to calculate Ka, Ks values".

Could anyone please help me with how to proceed? I can use Python, but I don't know which arguments the command-line version of the tool accepts. Any resource or help will be appreciated. Thanks.

python command-line • 6.8k views
10.3 years ago
Biojl ★ 1.7k

Well, if you download the script from their website you will see that Pal2Nal is in fact a Perl script. All the arguments it accepts are listed at the beginning of the Perl script.

From Python, with the basic parameters needed to use your alignment in codeml, you could call it like so:

os.system('perl pal2nal.pl ' + protein_alignment + ' ' + input_nucleotide_file + ' -output paml > ' + output_pal2nal)  # Convert the protein alignment to a codon (nucleotide) alignment; requires "import os"

Here protein_alignment, input_nucleotide_file, and output_pal2nal are variables holding file names (I think you can guess their contents from the names).
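For a script that has to process thousands of files, the same call can be sketched with the subprocess module, which avoids building a shell string by hand. This is only a minimal sketch under some assumptions: the function names build_pal2nal_command and run_pal2nal are made up for illustration, pal2nal.pl is assumed to sit in the working directory, and perl is assumed to be on the PATH. The -output paml and -nogap options are the ones that appear in this thread.

```python
import subprocess
from pathlib import Path


def build_pal2nal_command(protein_alignment, nucleotide_file,
                          script="pal2nal.pl", output_format="paml",
                          nogap=False):
    """Return the argument list for one pal2nal run (nothing is executed).

    The helper name and the default script location are assumptions made
    for this example; adjust them to your own setup.
    """
    command = ["perl", str(script), str(protein_alignment),
               str(nucleotide_file), "-output", output_format]
    if nogap:
        command.append("-nogap")
    return command


def run_pal2nal(protein_alignment, nucleotide_file, output_path, **options):
    """Run pal2nal and write its standard output to output_path."""
    command = build_pal2nal_command(protein_alignment, nucleotide_file,
                                    **options)
    with open(output_path, "w") as handle:
        # stdout=handle plays the role of the shell's "> file" redirection.
        # check=True raises CalledProcessError if perl exits with an error.
        subprocess.run(command, stdout=handle, check=True)
    return Path(output_path)
```

Calling run_pal2nal("test.aln", "test.nuc", "test.codon", nogap=True) would then write the codon alignment to test.codon (a hypothetical output name), and the same function can be called in a loop over all of your alignment and sequence file pairs.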


@Biojl, thanks very much for the lead. After downloading it, I first tried it on the command line to test whether it works, like so: [code] edson@samsung:~/pal2nal.v14$ perl pal2nal.pl test.aln test.nuc > -output paml -nogap [/code]

But it is not working yet, and this is what I get: [code] Can't open paml at pal2nal.pl line 335. [/code]

I think with your lead I'm a small step away, but there are one or a few subtle issues I am missing. Could you help?

Thanks.

P.S. As a follow-up, it seems there are issues with the pal2nal output options. The default is "clustal", and if you call the command line with the default it just works. If you call it with 'paml' or 'fasta', that is when the "Can't open paml at pal2nal.pl.." or "Can't open fasta at pal2nal.pl.." message comes out.

How do you fix that?


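I think you have an error in your command line. The '>' should come after you specify -output paml. As written, the shell takes the word right after '>' (here -output) as the name of the output file and passes paml to pal2nal.pl as an extra argument, which is presumably why the script tries to open paml as a file. Check again the command I wrote in the answer.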
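To make the effect of the misplaced '>' concrete, here is a small Python sketch that imitates, in a very simplified way, how a POSIX shell separates a single '>' redirection from the arguments it hands to the program. The function split_redirect is a hypothetical helper written for this illustration, and test.codon is just an example output name; neither is part of pal2nal.

```python
import shlex


def split_redirect(command_line):
    """Split a command line into (argv, redirect_target).

    Simplified imitation of a POSIX shell: the word right after a
    standalone '>' is the output file, everything else is passed to
    the program as arguments.
    """
    tokens = shlex.split(command_line)
    argv, target = [], None
    i = 0
    while i < len(tokens):
        if tokens[i] == ">" and i + 1 < len(tokens):
            target = tokens[i + 1]  # file that receives standard output
            i += 2
        else:
            argv.append(tokens[i])
            i += 1
    return argv, target


# Command as typed in the comment: '>' comes before the options.
broken = "perl pal2nal.pl test.aln test.nuc > -output paml -nogap"
# Corrected order: options first, redirection last.
fixed = "perl pal2nal.pl test.aln test.nuc -output paml -nogap > test.codon"

for line in (broken, fixed):
    argv, target = split_redirect(line)
    print("arguments:", argv, "| output file:", target)
```

For the first command the output file comes out as -output and paml is left among the arguments, which matches the "Can't open paml" message. For the second command the options stay with pal2nal.pl and the output goes to test.codon.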
