I need to calculate the Ka/Ks ratio. Here is what I have: 1. Assembled transcripts (assembled with Trinity) 2. 100 gene sequences from a specific gene family 3. 80 protein sequences from the aforementioned genes.
Q1: How do I build the dataset for the phylogenetic tree? Should I combine 1. (only the transcripts that mapped to sequences in 2., of course) with 2., or should I use only the sequences from 1. that mapped to sequences in 2.?
Q2: How do I calculate Ka/Ks using my transcript sequences (1.) and the reference protein sequences (3.)?
I have studied the PAML/PAL2NAL and MEGA5 pipelines. They perform multiple sequence alignment between sequences of the same type (i.e. either mRNA or protein), which is where my case differs. Should I translate the selected transcripts from 1. into protein and then perform the MSA?
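To make that last question concrete, here is a rough Python sketch of the conversion I have in mind. It assumes each selected transcript has already been trimmed to an in-frame coding sequence and that the standard genetic code applies. The function names `translate` and `thread_codons` are my own, not part of PAML, PAL2NAL or MEGA. The second function only illustrates the general idea of laying codons back onto a gapped protein alignment to get a codon alignment; it is not PAL2NAL's actual implementation.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(cds):
    """Split an in-frame coding sequence into codons, dropping a trailing partial codon."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def translate(cds):
    """Translate an in-frame coding sequence; unknown codons (e.g. with N) become 'X'."""
    return "".join(CODON_TABLE.get(codon, "X") for codon in split_codons(cds))


def thread_codons(aligned_protein, cds):
    """Lay the codons of `cds` onto a gapped protein alignment row.

    Assumes `aligned_protein` (gaps written as '-') is the translation of `cds`,
    so every non-gap residue corresponds to exactly one codon.
    """
    codons = split_codons(cds)
    residues = sum(1 for aa in aligned_protein if aa != "-")
    if residues != len(codons):
        raise ValueError("protein row and coding sequence lengths do not match")
    codon_iter = iter(codons)
    return "".join("---" if aa == "-" else next(codon_iter) for aa in aligned_protein)


if __name__ == "__main__":
    cds = "ATGGCCAAA"                      # toy sequence, not real data
    print(translate(cds))                  # protein to feed into the MSA
    print(thread_codons("M-AK", cds))      # codon-level row for the Ka/Ks step
```

If this is the right idea, the translated transcripts would be aligned together with the reference proteins in 3., and the resulting codon alignment would then be the input for the Ka/Ks calculation.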
Any suggestions are highly valued. Thanks in advance.