Question: [R] How to use the consistency index calculation (phangorn) on multi FASTA alignments?
Kame (UK) wrote 19 months ago:

Hello,

Sorry in advance if my problem is very simple; I am only just starting to use R! :) Thanks a lot in advance for your help.

I have a set of DNA sequence alignments in FASTA format (~150 aligned sequences each, sometimes with gaps) and a corresponding phylogeny. I would like to use them to calculate consistency indices (CI), to get a rough idea of the level of homoplasy in various sequences of interest.

For this, I saw that I can use the CI() function from the R package phangorn. Below is what I do, but the R session aborts (bomb) as soon as I run the last line, both on my local machine and on a large cluster with lots of RAM:

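library(ape)       # read.FASTA(), read.tree()
library(phangorn)  # phyDat(), CI()

seqdata = read.FASTA(file="multifasta.fasta")   # aligned sequences as DNAbin
seqdataPhy = phyDat(seqdata, type="DNA")        # convert to phangorn's phyDat
phyloTree = read.tree(file="tree.nwk")          # corresponding phylogeny
CI(phyloTree, seqdataPhy)                       # the session aborts here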

I suspect something goes wrong when I convert the sequence alignment to phyDat format (maybe I am forgetting some options?), but I haven't found any documentation covering what I would like to do, so I am asking here. Thanks a lot, I am grateful for any advice!

All the best,

KS


Comment by cpad0112, 19 months ago:

Could you share your files?
Kame (UK) wrote 19 months ago:

Hi. I eventually found out that there was an inconsistency in the leaf names of my tree: rapidnj, which I used to build it, adds single quotes around the leaf names, so they no longer matched the sequence names in the alignment, and this made CI() crash.
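
For anyone hitting the same crash, here is a minimal sketch of how the quoted leaf names could be cleaned up and checked before calling CI(). The toy tree and alignment are made-up stand-ins for the real files, and the quotes are added by hand to mimic the rapidnj labels; the variable names are only illustrative.

```r
library(ape)       # read.tree()
library(phangorn)  # phyDat(), CI()

# Toy tree (hypothetical data); quotes are added by hand to mimic
# the single-quoted leaf names described above.
tree <- read.tree(text = "((t1:1,t2:1):1,t3:1,t4:1);")
tree$tip.label <- paste0("'", tree$tip.label, "'")

# Toy alignment (hypothetical data): one row per taxon, one column per site.
aln <- phyDat(rbind(
  t1 = c("a", "c", "g", "t", "a"),
  t2 = c("a", "c", "g", "t", "c"),
  t3 = c("a", "t", "g", "a", "a"),
  t4 = c("a", "t", "g", "a", "c")
), type = "DNA")

# Strip a leading and/or trailing single quote from every tip label.
tree$tip.label <- gsub("^'|'$", "", tree$tip.label)

# Check that tree and alignment carry exactly the same names
# before handing them to CI(); stop with an error otherwise.
missing_in_aln  <- setdiff(tree$tip.label, names(aln))
missing_in_tree <- setdiff(names(aln), tree$tip.label)
stopifnot(length(missing_in_aln) == 0, length(missing_in_tree) == 0)

ci <- CI(tree, aln)
print(ci)
```

With real data, the same gsub() and setdiff() steps would be applied to the objects returned by read.tree() and phyDat(); printing missing_in_aln and missing_in_tree shows exactly which names disagree.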
