Finding genes under positive selection in non-model species - phylotranscriptomics approach?
7 months ago
sunnykevin97 ▴ 450

Hi,

I work with non-model organisms and am trying to identify the genes under positive selection in non-model fish species.

I de novo assembled the transcripts using Trinity.

Removed redundant transcripts using CD-HIT.

Predicted ORFs using TransDecoder.

The longest ORFs were run through OrthoFinder to get the multiple sequence alignment (MSA) of the orthologs and the species tree (raxml-ng).

Judging by the species tree and the MSA file, OrthoFinder aligned the orthologs shared among the 9 non-model fish species.

I'd like to know which genes are under positive selection. For this I converted the MSA.fa alignment file to PHYLIP format and ran it through codeml from the PAML package. It is not working for my data; I suppose that is because the dataset is large.

Error from codeml: 386273 nucleotides, not a multiple of 3!

Some help is needed. Is my approach correct?

phylip file:
9 386273
SRR363205 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLETSLAEH
SRR363207 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH
SRR363206 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH
SRR363205 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH
SRR363202 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH
SRR363201 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH
SRR363204 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH
SRR363203 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH
SRR363205 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH

YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE

tree file:
 ((SRR363206:0.012416,SRR363205:0.069747):0.0036785,((SRR363205:0.013234,(SRR363205:0.00518,((SRR363203:0.00817,SRR363201:0.002449):0.000959(SRR363202:0.003255,SRR363204:0.003052):0.001105):0.002049):0.005519):0.005375,SRR363207:0.016243):0.0036785);%  

Suggestions please.

Thanks Kevin

rna_seq gene assembly ortholog • 352 views
7 months ago
pinn ▴ 130

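PAML doesn't work with amino acid sequences as input for this kind of analysis. The codon models in codeml that are used to test for positive selection need an in-frame nucleotide (codon) alignment, in PHYLIP/PAML format.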
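
The error in the question points the same way: codeml read 386273 columns as nucleotides, and 386273 leaves a remainder of 2 when divided by 3, so the columns cannot be split into whole codons. A minimal sketch of a sanity check to run before codeml, in plain Python. The function names `looks_like_protein` and `codon_length_ok` and the 0.9 threshold are made up for illustration, not part of PAML.

```python
# Characters expected in a nucleotide alignment (bases, N, gap, missing).
NUCLEOTIDE_CHARS = set("ACGTUN-?")


def looks_like_protein(seq, threshold=0.9):
    """Guess whether a sequence is protein.

    Returns True when fewer than `threshold` of the non-space characters
    are nucleotide characters. The threshold is an arbitrary assumption.
    """
    residues = [c for c in seq.upper() if not c.isspace()]
    if not residues:
        raise ValueError("empty sequence")
    nuc_fraction = sum(c in NUCLEOTIDE_CHARS for c in residues) / len(residues)
    return nuc_fraction < threshold


def codon_length_ok(n_sites):
    """A codon alignment must have a column count divisible by 3."""
    return n_sites % 3 == 0


# First row of the PHYLIP file from the question.
first_row = "MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLETSLAEH"
print(looks_like_protein(first_row))  # True: these are amino acids
print(codon_length_ok(386273))        # False: 386273 % 3 == 2
```

If the first check says "protein", the fix is not to trim the alignment to a multiple of 3 but to go back to the nucleotide coding sequences.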


I realized this only recently. I have an MSA file (aligned orthologs); how do I convert it into PAML format for positive selection analysis?

Suggestions please.
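
One common route is to keep the protein alignment as a guide and thread each sequence's own coding sequence (CDS) onto it, one codon per residue and `---` per gap. Tools such as PAL2NAL do this properly. The sketch below only illustrates the idea, assuming you still have the nucleotide CDS for every aligned protein (TransDecoder writes these alongside the peptides). The function names `thread_codons` and `to_paml_phylip` and the toy sequences are hypothetical, and the code does not check that each codon actually translates to the aligned residue.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def thread_codons(aligned_protein, cds):
    """Map an unaligned CDS onto its aligned protein sequence."""
    cds = cds.upper().replace(" ", "")
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    # Drop a trailing stop codon if the CDS is exactly one codon longer.
    if len(cds) == 3 * (n_residues + 1) and cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    if len(cds) != 3 * n_residues:
        raise ValueError(
            f"CDS has {len(cds)} nt but the protein has {n_residues} residues"
        )
    codons, pos = [], 0
    for aa in aligned_protein:
        if aa == "-":
            codons.append("---")          # a protein gap becomes a codon gap
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)


def to_paml_phylip(codon_alignment):
    """Format {name: aligned codon sequence} as sequential PHYLIP text.

    Name and sequence are separated by two spaces, which PAML uses to
    find the end of the name.
    """
    lengths = {len(seq) for seq in codon_alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length")
    n_sites = lengths.pop()
    if n_sites % 3 != 0:
        raise ValueError("alignment length is not a multiple of 3")
    lines = [f"{len(codon_alignment)} {n_sites}"]
    lines += [f"{name}  {seq}" for name, seq in codon_alignment.items()]
    return "\n".join(lines) + "\n"


# Toy data, invented for illustration only.
protein_msa = {"sp1": "MK-L", "sp2": "MKTL"}
cds = {"sp1": "ATGAAACTGTAA", "sp2": "ATGAAAACCCTG"}

codon_msa = {name: thread_codons(protein_msa[name], cds[name])
             for name in protein_msa}
print(to_paml_phylip(codon_msa), end="")
```

Every sequence name in the resulting file has to be unique and match a tip label in the tree file, so it is worth checking both for repeated names before running codeml.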
