Test of selection across whole genomes
8.6 years ago

Hi,

I have a question regarding tests of selection in PAML using full genomes. My dataset contains multiple individuals (2 to 8) for each of 6 different species (A to F). Two of these species (A and B) have annotated genomes. The idea is to run the analysis twice, first using species A as the reference and then using species B. This two-way analysis will provide some sort of 'control'. I have three main concerns/questions:

The first step would be to map each individual of each species to each reference (A and B) and then generate a consensus sequence for each species for the PAML analysis. So I would end up with one BAM file per individual, which I would use to build the consensus. Which tool would be best for generating this consensus? Do I also need to generate a VCF file to take variants into account, replacing them in the consensus with N or with the most common allele?
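To make the two options in the last question concrete (N versus most common allele), here is a minimal sketch in plain Python. It is not a recommendation of a tool: it assumes each individual has already been reduced to a sequence in the coordinates of the same reference, and the function name `majority_consensus` and the `min_fraction` cutoff are hypothetical choices for illustration.

```python
from collections import Counter


def majority_consensus(sequences, min_fraction=0.5):
    """Build a per-site species consensus from equal-length individual sequences.

    A site gets the most common allele when that allele is carried by more
    than `min_fraction` of the individuals with a call at the site; otherwise
    the site is masked with 'N'. (Illustrative assumption, not a PAML rule.)
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")

    consensus = []
    for column in zip(*sequences):
        # Ignore missing or ambiguous calls when counting alleles.
        calls = [base.upper() for base in column if base.upper() in "ACGT"]
        if not calls:
            consensus.append("N")
            continue
        base, count = Counter(calls).most_common(1)[0]
        consensus.append(base if count / len(calls) > min_fraction else "N")
    return "".join(consensus)
```

With three individuals `["ACGT", "ACGA", "ACGA"]` the last site resolves to the majority allele `A`; with two individuals that disagree at a site, the site is masked with `N`.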

Secondly, I would have to extract the sequences of orthologous genes (using the .gff file of each reference), and I assume I would have to concatenate the exons of each gene to create my input file for the PAML analysis. Or should I do this in the previous step, when creating the consensus? Which approach would be best?
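The extraction step described here can be sketched as follows. This is only an illustration under stated assumptions: the annotation is GFF3 with `CDS` features that carry a `Parent=` attribute, coordinates are 1-based and inclusive, and the consensus genome is already loaded as a dict of sequence ID to string. The function names are hypothetical, and keeping the transcript ID as the key is what later lets each result be traced back to the annotation.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def cds_sequences(gff_lines, genome):
    """Concatenate CDS segments per parent transcript.

    gff_lines: iterable of GFF3 lines (assumed tab-separated, 9 columns).
    genome: dict mapping sequence ID to its sequence string.
    Returns a dict mapping the Parent ID to the spliced coding sequence.
    """
    segments = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "CDS":
            continue
        seqid, start, end, strand = fields[0], int(fields[3]), int(fields[4]), fields[6]
        attributes = dict(
            item.split("=", 1) for item in fields[8].split(";") if "=" in item
        )
        parent = attributes.get("Parent")
        if parent is not None:
            segments[parent].append((seqid, start, end, strand))

    spliced = {}
    for parent, parts in segments.items():
        # Join segments in ascending genomic order, then flip minus-strand genes.
        parts.sort(key=lambda part: part[1])
        seq = "".join(genome[seqid][start - 1:end] for seqid, start, end, _ in parts)
        if parts[0][3] == "-":
            seq = reverse_complement(seq)
        spliced[parent] = seq
    return spliced
```

Because the function only needs a genome dict, the same annotation can be applied to each species' consensus in turn, which keeps extraction separate from consensus building.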

Finally, as I said above, this PAML analysis will be run twice, once using species A as the reference and once using species B. To make the results comparable, both genomes were aligned (satsumasynteny) and the annotations of species B were 'lifted over' to species A using kraken. I have not used PAML yet, so maybe that's why I'm a bit confused, but how will I be able to compare the results from these two runs? Will PAML estimate dN/dS for each codon or for each gene? Will the results be indexed, and if not, how can I find which region of my annotation each result corresponds to? Since there are no coordinates in my FASTA file, how will I check for consistency between runs (i.e. do both analyses show the same dN/dS in the same genes)?
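One way to picture the consistency check, as a hedged sketch only: assume each gene alignment is analysed separately and its dN/dS estimate is stored under the gene ID, giving one dict per reference. The function `compare_runs`, the dict layout, and the cutoff of 1.0 are assumptions made for illustration, not part of PAML, and a value above 1 is not by itself a significance test.

```python
def compare_runs(omega_ref_a, omega_ref_b, threshold=1.0):
    """Compare per-gene dN/dS estimates from two runs keyed by gene ID.

    Returns one (gene, omega_a, omega_b, agree) tuple per gene present in
    both runs, sorted by gene ID. `agree` is True when both estimates fall
    on the same side of `threshold`.
    """
    shared = sorted(set(omega_ref_a) & set(omega_ref_b))
    rows = []
    for gene in shared:
        a, b = omega_ref_a[gene], omega_ref_b[gene]
        agree = (a > threshold) == (b > threshold)
        rows.append((gene, a, b, agree))
    return rows
```

Genes found in only one run are dropped here; listing them separately would show where the lift-over lost annotations.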

Thanks a lot for your help!

Nic

selection PAML • 2.5k views

I am also trying to do the same with PAML. How did you do it?
