Test of selection across whole genomes
6.7 years ago


I have a question about testing for selection with PAML using full genomes. My dataset contains multiple individuals (2 to 8) for each of 6 species (A to F). Two of these species (A and B) have annotated genomes. The idea is to run the analysis twice, first using species A and then species B as the reference. This two-way analysis should provide some sort of ‘control’. I have three main questions:

The first step would be to map each individual of each species to each reference (A and B) and then generate a consensus sequence per species for the PAML analysis. I would end up with one BAM file per individual, which I would use to build the consensus. Which tool would be best for generating this consensus? Do I also need to generate a VCF file to account for variants, replacing them in the consensus with N or with the most common allele?
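To make the N versus most-common-allele choice concrete, here is a minimal sketch of the rule I have in mind. It assumes each individual has already been reduced to a sequence in reference coordinates (same length for everyone), which is not something the mapping step gives you directly. The function name `majority_consensus` and the `min_fraction` threshold are hypothetical, not part of any existing tool.

```python
from collections import Counter

def majority_consensus(seqs, min_fraction=0.5):
    """Column-wise consensus of equal-length sequences.

    A column gets the most common A/C/G/T call if it is supported by more
    than `min_fraction` of the called bases and is not tied; otherwise N.
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    consensus = []
    for column in zip(*seqs):
        # Ignore uncalled or ambiguous bases when counting alleles.
        calls = [b.upper() for b in column if b.upper() in "ACGT"]
        if not calls:
            consensus.append("N")
            continue
        (top, n_top), *rest = Counter(calls).most_common()
        tied = bool(rest) and rest[0][1] == n_top
        supported = n_top / len(calls) > min_fraction
        consensus.append(top if supported and not tied else "N")
    return "".join(consensus)

# Hypothetical example: three individuals of one species.
print(majority_consensus(["ACGT", "ACGA", "ACGT"]))
```

With only two individuals, any disagreement is a tie and becomes N under this rule, which may be too strict for the species with few samples.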

Secondly, I would have to extract the sequences of orthologous genes (using the .gff file of each reference), and I assume I would have to concatenate the exons of each gene to create my input file for the PAML analysis. Should I do this in the previous step instead, when creating the consensus? Which approach would be best?
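The extraction I am describing would look roughly like the sketch below. It assumes a GFF3 file in which coding segments are `CDS` features carrying a `Parent` attribute (usually the transcript ID), and a consensus genome already loaded as a dict of sequence name to string. I use CDS features rather than exon features on the assumption that UTRs should stay out of a codon-based analysis. The function names are hypothetical.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def cds_by_parent(gff_lines):
    """Group CDS segments by their Parent attribute (GFF3, 1-based inclusive)."""
    segments = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        parent = attrs.get("Parent")
        if parent is None:
            continue  # cannot assign this segment to a transcript
        segments[parent].append((cols[0], int(cols[3]), int(cols[4]), cols[6]))
    return segments

def spliced_cds(genome, segments):
    """Concatenate segments in genomic order; reverse-complement if on '-'."""
    ordered = sorted(segments, key=lambda s: s[1])
    seq = "".join(genome[name][start - 1:end] for name, start, end, _ in ordered)
    if ordered[0][3] == "-":
        seq = seq.translate(COMPLEMENT)[::-1]
    return seq

# Hypothetical toy data: one transcript with two coding segments.
genome = {"chr1": "ATGAAACCCTAGGG"}
gff = [
    "chr1\tsrc\tCDS\t7\t12\t.\t+\t0\tID=cds2;Parent=tx1",
    "chr1\tsrc\tCDS\t1\t3\t.\t+\t0\tID=cds1;Parent=tx1",
]
for parent, segs in cds_by_parent(gff).items():
    print(f">{parent}\n{spliced_cds(genome, segs)}")
```

Keeping the transcript ID as the FASTA header is the point here: it carries the link back to the annotation even though the sequence itself has no coordinates.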

Finally, as I said above, the PAML analysis will be run twice, once with species A as the reference and once with species B. To make the results comparable, both genomes were aligned (with SatsumaSynteny) and the annotations of species B were ‘lifted over’ to species A using kraken. I have not used PAML yet, so maybe that is why I am a bit confused, but how will I be able to compare the results of the two runs? Will PAML estimate dN/dS for each codon or for each gene? Will the results be indexed, or how else can I find which region of my annotation each result corresponds to? Since there are no coordinates in my FASTA file, how can I check for consistency between runs (i.e. do both analyses show the same dN/dS for the same genes)?
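The kind of consistency check I am imagining is sketched below. It assumes each gene is analysed separately so that every run yields one dN/dS value per gene ID, and that the lifted-over annotation gives both runs the same IDs; I have not verified that this is how the output is organised. The function name `compare_runs`, the `tolerance` parameter, and the example values are all made up for illustration.

```python
def compare_runs(omega_a, omega_b, tolerance=0.1):
    """Join two {gene_id: dN/dS} tables on gene ID.

    Returns (rows, only_a, only_b), where each row is
    (gene_id, omega_a, omega_b, agrees) and `agrees` means the two
    estimates differ by at most `tolerance`.
    """
    shared = sorted(set(omega_a) & set(omega_b))
    rows = [
        (g, omega_a[g], omega_b[g], abs(omega_a[g] - omega_b[g]) <= tolerance)
        for g in shared
    ]
    only_a = sorted(set(omega_a) - set(omega_b))
    only_b = sorted(set(omega_b) - set(omega_a))
    return rows, only_a, only_b

# Hypothetical values, not real results.
run_a = {"gene1": 0.20, "gene2": 1.50}
run_b = {"gene1": 0.25, "gene3": 0.10}
rows, only_a, only_b = compare_runs(run_a, run_b)
for gene, wa, wb, agrees in rows:
    print(gene, wa, wb, "consistent" if agrees else "differs")
print("only in run A:", only_a)
print("only in run B:", only_b)
```

Genes that appear in only one run would point to places where the lift-over or the mapping failed, which seems worth checking before interpreting any difference in dN/dS.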


Thanks a lot for your help!


PAML selection • 2.2k views
6.2 years ago
ragavishn ▴ 20

I am also trying to do the same with PAML. How did you do it?

