Question: Test of selection across whole genomes
3.6 years ago by
New Zealand
nicolas.dussex30 wrote:

Hi,

I have a question regarding tests of selection in PAML using full genomes. My dataset contains multiple individuals (2 to 8) for each of 6 different species (A to F). Two of these species (A and B) have annotated genomes. The idea is to run the analysis twice, first using species A and then using species B as the reference. This two-way analysis should provide some sort of ‘control’. I have three main concerns/questions:

The first step would be to map each individual of each species to each reference (A and B) and then generate a consensus sequence for each species for the PAML analysis. So I would end up with 1 BAM file per individual, which I would use to build the consensus. Which tool would be best for generating this consensus? Do I also need to generate a VCF file to take variants into account and replace them with N or the most common allele in the consensus?
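To make the two options in the last sentence concrete, here is a minimal Python sketch, assuming the per-individual sequences for a species have already been extracted in the same reference coordinates and are of equal length. `species_consensus` is a hypothetical helper written for illustration, not part of PAML or of any mapping or variant-calling tool.

```python
from collections import Counter


def species_consensus(seqs, mode="majority"):
    """Collapse equal-length per-individual sequences into one consensus.

    mode="majority": take the most common base at each variable site
                     (ties go to the base seen first).
    mode="mask":     write 'N' at every site where individuals disagree.
    Bases that are already 'N' in an individual are ignored.
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    out = []
    for column in zip(*seqs):
        bases = [b.upper() for b in column if b.upper() != "N"]
        if not bases:
            out.append("N")          # no individual has a call here
        elif len(set(bases)) == 1:
            out.append(bases[0])     # invariant site
        elif mode == "mask":
            out.append("N")          # variable site, masked
        else:
            out.append(Counter(bases).most_common(1)[0][0])
    return "".join(out)
```

With three individuals `["ACGT", "ACGA", "ACGA"]`, the majority mode keeps the common allele at the last site and the mask mode turns that site into N.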

Secondly, I would have to extract the sequences for orthologous genes (using the .gff file of each reference), and I assume that I would have to concatenate these exons to create my input file for the PAML analysis. Should I do this in the previous step instead, when creating the consensus? Which approach would be best?
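As a rough illustration of the extract-and-concatenate step, the sketch below joins the CDS features of one gene from GFF lines into a single coding sequence. It assumes a single scaffold sequence held in memory, 1-based inclusive GFF coordinates, and CDS features whose `Parent` attribute is exactly the gene ID (in many real annotations the parent is a transcript ID instead). `concat_cds` and `revcomp` are hypothetical names.

```python
def revcomp(seq):
    """Reverse-complement a DNA string (N stays N)."""
    return seq.translate(str.maketrans("ACGTNacgtn", "TGCANtgcan"))[::-1]


def concat_cds(scaffold_seq, gff_lines, gene_id):
    """Concatenate the CDS segments of one gene in transcript order."""
    segments = []
    strand = "+"
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "CDS":
            continue
        if f"Parent={gene_id}" not in fields[8].split(";"):
            continue
        start, end = int(fields[3]), int(fields[4])
        strand = fields[6]
        # GFF is 1-based inclusive; Python slices are 0-based half-open.
        segments.append((start, scaffold_seq[start - 1:end]))
    joined = "".join(seq for _, seq in sorted(segments))
    # Minus-strand genes are read from the opposite strand.
    return revcomp(joined) if strand == "-" else joined
```

Applying the same function to each species' consensus in reference coordinates would give one sequence per species per gene, which can then be written out under the gene ID.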

Finally, as I said above, this PAML analysis will be run in two ways, once using species A as the reference and once using species B. To make the results comparable, both genomes were aligned (SatsumaSynteny) and the annotations of species B ‘lifted over’ to species A using Kraken. I have not used PAML yet, so maybe that’s why I’m a bit confused, but how will I be able to compare the results from these two runs? Will PAML estimate dN/dS for each codon or for each gene? Will the results be indexed, or how can I find which region of my annotation each result is associated with? Since there are no coordinates in my FASTA file, how will I check for consistency between runs (i.e. do both analyses show the same dN/dS in the same genes)?
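One possible way to check consistency, sketched under the assumption that each gene alignment is analysed separately and its dN/dS estimate is stored under a gene ID shared by both annotations after the lift-over. The function name, the dictionaries, and the tolerance are hypothetical and only illustrate the bookkeeping, not how PAML reports its results.

```python
def compare_runs(omega_ref_a, omega_ref_b, tol=0.1):
    """Compare per-gene dN/dS from two runs keyed by gene ID.

    Returns (rows, only_a, only_b): rows holds one tuple per shared gene,
    (gene, omega_a, omega_b, agrees), where agrees means the two estimates
    differ by at most tol; only_a and only_b list genes seen in one run only.
    """
    shared = sorted(set(omega_ref_a) & set(omega_ref_b))
    rows = []
    for gene in shared:
        a, b = omega_ref_a[gene], omega_ref_b[gene]
        rows.append((gene, a, b, abs(a - b) <= tol))
    only_a = sorted(set(omega_ref_a) - set(omega_ref_b))
    only_b = sorted(set(omega_ref_b) - set(omega_ref_a))
    return rows, only_a, only_b
```

Keying everything by gene ID rather than by coordinates sidesteps the fact that the FASTA input carries no positions.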

 

Thanks a lot for your help!

Nic

selection paml • 1.5k views
3.1 years ago by
ragavishn20
United States
ragavishn20 wrote:

I am also trying to do the same with PAML. How did you do it?
