Question: How To Do Alignment, Stop Codon Removal And dN/dS Calculation In One Go?
7.9 years ago by
United States
Nari880 wrote:

I have over 1000 files, each containing 30 sequences. Manually aligning them, removing stop codons, and then calculating the average dN/dS for each file is not feasible.
Is there a way to do this from the command line?
(I know PAML, but I don't know of any tool that produces alignments in PAML format or removes stop codons.)
Even three separate tools, one per step, would be fine, as long as I can run them from the command prompt.

(I'm on Win7)

Thanks in advance.

paml • 10k views
modified 13 months ago by Vincent Ranwez30 • written 7.9 years ago by Nari880

What aligner would you like to use? Most, if not all, have a command-line interface. Deleting the stop codons afterwards should be "trivial". For example, Biopython has an interface for most aligners and can run PAML as well. However, please keep in mind that dN/dS calculations are (obviously) very dependent on a good alignment. The big downside of this automated approach is that you will likely not quality-check each alignment before moving on.
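To make the "trivial" step concrete, here is a minimal sketch in plain Python (no Biopython needed). It assumes the sequence is already in frame (codon 1 starts at position 1), that gaps come in whole codons, and that the standard genetic code applies. The function name and the NNN / --- replacements are illustrative choices, not part of any tool mentioned in this thread.

```python
# Stop codons of the standard genetic code only (assumption: no alternative codes).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def mask_stop_codons(seq, internal="NNN", final="---"):
    """Replace in-frame stop codons in a codon-aligned nucleotide sequence.

    A stop in the last non-gap codon is replaced by `final`; any other
    in-frame stop is replaced by `internal`. The result is upper-cased.
    """
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # Index of the last codon that is not made only of gap characters.
    last = max((i for i, c in enumerate(codons) if set(c) != {"-"}), default=-1)
    masked = []
    for i, codon in enumerate(codons):
        if codon in STOP_CODONS:
            masked.append(final if i == last else internal)
        else:
            masked.append(codon)
    return "".join(masked)


print(mask_stop_codons("ATGTAAGGGTGA"))
```

Out-of-frame occurrences such as the TAA inside ATGATAAGG are deliberately left alone, because only in-frame stops matter for codon models.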

written 7.9 years ago by Whetting1.5k

Hello, I have a similar problem. I have a large dataset and must delete the stop codons in the sequences. Can you give me some suggestions?


written 4.4 years ago by wangqingqing0
7.9 years ago by
jprmachado60 wrote:


Some time ago I had the same problem. I solved it using a Perl script available here (. Since it needs to be fed both nucleotide and amino acid sequences, I used T-Coffee to translate.

This worked fine for me. I did it on Linux; on Windows you may need to write a .bat file to do it easily. Take a look at a tutorial on .bat files for the syntax if you are not familiar with them. I think that will work.
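If writing a .bat file is a hurdle, a small Python script can play the same role on both Windows and Linux. The sketch below is only an outline: the function names, the "{input}" / "{output}" placeholders, and the ".out" suffix are hypothetical, and the command template has to be filled in with whatever tool is actually used for each step.

```python
import subprocess
from pathlib import Path


def build_jobs(input_dir, output_dir, command_template, pattern="*.fasta"):
    """Return one (command, output_path) pair per matching input file.

    `command_template` is a list of arguments in which "{input}" and
    "{output}" are replaced by the actual file paths.
    """
    jobs = []
    for path in sorted(Path(input_dir).glob(pattern)):
        out_path = Path(output_dir) / (path.stem + ".out")
        command = [part.format(input=str(path), output=str(out_path))
                   for part in command_template]
        jobs.append((command, out_path))
    return jobs


def run_jobs(jobs):
    """Run each command in turn and collect the ones that fail."""
    failed = []
    for command, out_path in jobs:
        out_path.parent.mkdir(parents=True, exist_ok=True)
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append((command, result.stderr))
    return failed
```

Keeping command construction separate from execution makes it easy to print the 1000+ commands first and inspect a few before running anything.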

You can use FASTA format for the PAML sequence file; there is no need for the .pml format.
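For the PAML step itself, codeml reads its settings from a control file, so one control file per alignment can be generated automatically. The following is a sketch under stated assumptions: pairwise dN/dS estimates (runmode = -2) are wanted, the standard genetic code applies, and the remaining values are common illustrative settings rather than recommendations from this thread. The function name is hypothetical; please check each setting against the PAML manual for your version.

```python
def codeml_control(seqfile, outfile, treefile=None, pairwise=True):
    """Return the text of a minimal codeml control file."""
    if not pairwise and treefile is None:
        raise ValueError("a tree file is needed unless pairwise=True")
    settings = [("seqfile", seqfile)]
    if treefile is not None:
        settings.append(("treefile", treefile))
    settings += [
        ("outfile", outfile),
        ("noisy", 0),
        ("verbose", 0),
        ("runmode", -2 if pairwise else 0),  # -2: pairwise comparison, 0: user tree
        ("seqtype", 1),       # 1: codon sequences
        ("CodonFreq", 2),     # 2: F3X4 codon frequencies
        ("model", 0),         # 0: one omega for all branches
        ("NSsites", 0),       # 0: one omega for all sites
        ("icode", 0),         # 0: universal genetic code
        ("fix_kappa", 0),
        ("kappa", 2),         # starting value, estimated when fix_kappa = 0
        ("fix_omega", 0),
        ("omega", 0.4),       # starting value, estimated when fix_omega = 0
        ("cleandata", 1),     # 1: drop sites with gaps or ambiguity characters
    ]
    return "".join(f"{key} = {value}\n" for key, value in settings)


print(codeml_control("cluster0001.fasta", "cluster0001.codeml.out"))
```

The returned text can be written to a file such as codeml.ctl in a per-alignment working directory, and codeml run there.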



modified 7.9 years ago • written 7.9 years ago by jprmachado60

Thanks so much!!

written 5.8 years ago by tlorin310

Hi @jprmachado

I am calculating dN/dS for 4431 gene clusters. I tried to remove stop codons with the pal2nal program, but for some reason around 198 clusters still have stop codons.

Now I have tried to remove the stop codons with the above script; I don't get any error, but the stop codons are not removed.

Any suggestions? Thank you.
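When a tool runs without errors but stop codons seem to remain, it can help to list exactly which sequences still contain in-frame stops, and where, before handing the files to PAML. Below is a minimal diagnostic sketch in plain Python. It assumes in-frame sequences and the standard genetic code; the function names are hypothetical and it does not depend on pal2nal or on the script mentioned above.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping header to sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def find_inframe_stops(seq, stop_codons=("TAA", "TAG", "TGA")):
    """Return 0-based nucleotide offsets of in-frame stop codons."""
    seq = seq.upper()
    return [i for i in range(0, len(seq) - 2, 3) if seq[i:i + 3] in stop_codons]


def report_stops(fasta_text):
    """Map each sequence that still has in-frame stops to their offsets."""
    report = {}
    for name, seq in parse_fasta(fasta_text).items():
        hits = find_inframe_stops(seq)
        if hits:
            report[name] = hits
    return report
```

Running report_stops on the text of each of the problem files would show whether the remaining stops are terminal, internal, or absent in this frame, which narrows down where to look next.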

written 3.9 years ago by krp000130

I tried it with a dummy data set and it worked, but with the actual data set it does not work when I run it to calculate dN/dS.

written 7 months ago by krushnach80870
7.9 years ago by
Vancouver, BC
SES8.4k wrote:

Pal2Nal will generate a codon alignment without stop codons, given a MSA of proteins and the corresponding DNA sequences. If the input is a pairwise alignment, I believe it will calculate dN/dS ratios (using PAML) for you automatically. Otherwise, you can just input your alignments to PAML to calculate dN/dS. Pal2Nal is written in Perl, so it should work on your Win7 machine (I don't know about PAML though, unless there is a Windows version available).
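As a sketch of how the Pal2Nal step could be scripted, the snippet below builds the command line and redirects the program's standard output to a file. It assumes Perl is on the PATH and that pal2nal.pl is in the working directory; the function names and file names are hypothetical. The -output and -nogap options are taken from the PAL2NAL usage message as I remember it, so please confirm them against your copy.

```python
import subprocess

ALLOWED_FORMATS = {"clustal", "paml", "fasta", "codon"}


def pal2nal_command(protein_alignment, nucleotide_fasta,
                    script="pal2nal.pl", output_format="paml", remove_gaps=True):
    """Build the argument list for one pal2nal run."""
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")
    command = ["perl", script, protein_alignment, nucleotide_fasta,
               "-output", output_format]
    if remove_gaps:
        # -nogap drops columns containing gaps and in-frame stop codons.
        command.append("-nogap")
    return command


def run_pal2nal(command, out_path):
    """Run pal2nal and write the codon alignment it prints to out_path."""
    with open(out_path, "w") as handle:
        result = subprocess.run(command, stdout=handle,
                                stderr=subprocess.PIPE, text=True)
    return result.returncode


print(" ".join(pal2nal_command("cluster0001.pep.aln", "cluster0001.nuc.fasta")))
```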

written 7.9 years ago by SES8.4k

Should the nucleotide alignment be trimmed? And should the protein alignment be trimmed as well?

written 4.7 years ago by lilepisorus30
13 months ago by
Vincent Ranwez30 wrote:


The MACSE_V2 toolkit provides several tools for working with coding nucleotide sequences. The alignSequences subprogram of MACSE builds reliable codon alignments even in the presence of frameshifts or stop codons (especially useful for dN/dS analysis and pseudogene analysis). Moreover, this subprogram can handle sequences that use different genetic codes. MACSE also includes a subprogram specifically designed to replace stop codons (and frameshift codons) in an alignment. This subprogram (exportAlignment) lets you specify the codon (three letters of your choice) that will replace the stop codons. You can even provide two different codons: one for stops appearing within the sequence (unexpected except in pseudogenes) and one for stop codons appearing at the end of the sequences. While there are several options (e.g. to specify the output file name and the genetic code to use), the basic usage is quite straightforward:

java -jar macse.jar -prog exportAlignment -align align.fasta -codonForFinalStop --- -codonForInternalStop NNN
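To apply this command to many alignments, the argument list can be generated per file. The sketch below uses only the options shown in the command above; the function names and the "*.fasta" pattern are hypothetical, and it assumes macse.jar sits in the working directory and that the default output file names are acceptable.

```python
from pathlib import Path


def macse_export_command(alignment, jar="macse.jar",
                         final_stop="---", internal_stop="NNN"):
    """Build the exportAlignment command for one alignment file."""
    for codon in (final_stop, internal_stop):
        if len(codon) != 3:
            raise ValueError("replacement codons must be three characters long")
    return ["java", "-jar", jar, "-prog", "exportAlignment",
            "-align", str(alignment),
            "-codonForFinalStop", final_stop,
            "-codonForInternalStop", internal_stop]


def macse_commands_for_dir(directory, pattern="*.fasta"):
    """Return one command per alignment file found in `directory`."""
    return [macse_export_command(path)
            for path in sorted(Path(directory).glob(pattern))]


print(" ".join(macse_export_command("align.fasta")))
```

Each returned list can be passed to subprocess.run as is; passing a list rather than a single string avoids shell quoting problems with the "---" argument.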

To ease the alignment of coding nucleotide sequences, we also provide ready-to-use alignment pipelines (provided as Singularity containers), which include optional filtering steps. These pipelines output the (filtered) nucleotide alignment, the corresponding (filtered) amino acid alignment, and the details of the filtering steps (if any filtering steps were selected).

written 13 months ago by Vincent Ranwez30