I have a question regarding tests of selection in PAML using full genomes. My dataset contains multiple individuals (2 to 8) for each of 6 species (A to F). Two of these species (A and B) have an annotated genome. The idea is to run the analysis twice, first using species A as the reference and then using species B. Doing this two-way analysis should provide some sort of ‘control’. I have three main concerns/questions:
The first step would be to map each individual of each species to each reference (A and B) and then generate one consensus sequence per species for the PAML analysis. So I would end up with one BAM file per individual, which I would use to build the consensus. Which tool would be best for generating this consensus? Do I also need to generate a VCF file to take variants into account, and replace variable sites in the consensus with either N or the most common allele?
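To make the consensus question concrete, here is a minimal sketch of the per-site rule I have in mind. It assumes each individual has already been reduced to a sequence in reference coordinates (same length, no indels), which is a simplification. The function name `consensus` and the `min_fraction` threshold are hypothetical, and this is plain Python for illustration, not a replacement for a proper variant caller.

```python
from collections import Counter

def consensus(seqs, min_fraction=0.5, mask="N"):
    """Per-site majority consensus across individuals of one species.

    Assumption: all sequences are in the same reference coordinates and
    have the same length. A site is masked when no allele is carried by
    strictly more than `min_fraction` of the individuals with a call.
    """
    if not seqs:
        raise ValueError("need at least one sequence")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must have equal length")

    out = []
    for column in zip(*seqs):
        # Ignore missing or ambiguous calls (N, gaps, IUPAC codes).
        calls = [b.upper() for b in column if b.upper() in "ACGT"]
        if not calls:
            out.append(mask)
            continue
        base, count = Counter(calls).most_common(1)[0]
        out.append(base if count / len(calls) > min_fraction else mask)
    return "".join(out)
```

With `min_fraction=0.5`, a site that is split evenly between two alleles is masked with N, while a site where one allele is in the majority keeps that allele.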
Secondly, I would have to extract the sequences of orthologous genes (using the .gff file of each reference), and I assume that I would have to concatenate the exons of each gene to create my input file for the PAML analysis. Should I do this here, after the consensus has been built, or rather in the previous step, while creating the consensus? Which approach would be best?
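For the exon concatenation, this is roughly what I picture, written as a minimal sketch. It assumes GFF3-style lines with `CDS` features carrying a `Parent=` attribute, 1-based inclusive coordinates, and a genome held in memory as a dict of sequence name to string. The function names `revcomp` and `concatenate_cds` are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def revcomp(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def concatenate_cds(gff_lines, genome):
    """Return {parent_id: coding sequence} built from CDS features.

    Assumptions: GFF3 with nine tab-separated columns, 1-based inclusive
    start/end, and a 'Parent=' attribute on every CDS line.
    """
    segments = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        seqid, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        parent = attrs.get("Parent")
        if parent is not None:
            segments[parent].append((seqid, start, end, strand))

    result = {}
    for parent, segs in segments.items():
        segs.sort(key=lambda s: s[1])  # order segments by genomic start
        # Convert 1-based inclusive coordinates to Python slices.
        seq = "".join(genome[seqid][start - 1:end] for seqid, start, end, _ in segs)
        if segs[0][3] == "-":
            seq = revcomp(seq)  # minus-strand genes read from the other strand
        result[parent] = seq
    return result
```

Keying the output by the `Parent` ID means each sequence handed to PAML keeps a name that points back to the annotation.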
Finally, as I said above, this PAML analysis will be run in two ways, once using species A as the reference and once using species B. To make the results comparable, both genomes were aligned (with SatsumaSynteny) and the annotations of species B were ‘lifted over’ to species A using Kraken. I have not used PAML yet, so maybe that’s why I’m a bit confused, but how will I be able to compare the results from these two runs? Will PAML estimate dN/dS for each codon or for each gene? Will the results be indexed, or how else can I find out which region of my annotation each estimate corresponds to? Since there are no coordinates in my FASTA file, how will I check for consistency between runs (i.e. do both analyses show the same dN/dS for the same genes)?
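The comparison I would like to end up with looks something like the sketch below. It assumes I can get one dN/dS estimate per gene out of each run and key it by a gene ID shared between the two annotations (for example, via the lifted-over annotation). The function name `compare_runs`, the `tolerance` parameter, and the input dicts are all hypothetical, not PAML output formats.

```python
def compare_runs(run_a, run_b, tolerance=0.1):
    """Compare per-gene dN/dS estimates from two runs.

    run_a, run_b: dicts mapping gene ID -> dN/dS estimate (assumed to
    have been collected from the reference-A and reference-B analyses).
    Returns (rows, only_a, only_b), where each row is
    (gene, omega_a, omega_b, consistent) and `consistent` is True when
    the two estimates differ by at most `tolerance`.
    """
    shared = sorted(set(run_a) & set(run_b))
    rows = []
    for gene in shared:
        diff = abs(run_a[gene] - run_b[gene])
        rows.append((gene, run_a[gene], run_b[gene], diff <= tolerance))
    # Genes analysed in only one run, e.g. lost during lift-over.
    only_a = sorted(set(run_a) - set(run_b))
    only_b = sorted(set(run_b) - set(run_a))
    return rows, only_a, only_b
```

If something like this is reasonable, the remaining problem is how to carry the gene ID through PAML so that each estimate can be put into these tables.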
Thanks a lot for your help!