How to use a protein alignment FASTA file with the cluster function of the kmer package


2.6 years ago

pamela.obando • 0

Hello everyone, I'm trying to use the cluster function of the kmer package to obtain a dendrogram of a large set of protein sequences that are already aligned (FASTA). The cluster function requires input in AAbin format. I used as_AAbin, but the resulting value does not seem to be what cluster expects. With the example data "woodmouse", which appears to be an array, it works; in my case, however, "cadcbin" is not an array. Could you help?

library(seqinr)

cadcalignfil <- read.fasta("CadCfilalign.fasta", seqtype = "AA")

library(bioseq)

cadcv <- aa(cadcalignfil)

cadcbin <- as_AAbin(cadcv)

library(ape)

library(kmer)

cluster(cadcbin, k=4)

Converting to Dayhoff(6) compressed alphabet for k > 3
Classes: AGPST, C, DENQ, FWY, HKR, ILMV

Error in kcount(x, k = k, residues = residues, gap = gap, named = FALSE) : minimum sequence length is less than k

Best regards, Pam

arrays kmer AAbin • 669 views

updated 2.6 years ago by 5heikki 11k • written 2.6 years ago by pamela.obando • 0


I'm not familiar with this package, but generally speaking, for k-mer-based clustering you want to input raw sequences, not sequence alignments.
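One way to follow that suggestion is to strip the gap characters from the aligned sequences before clustering. The sketch below is only an illustration under a few assumptions: the alignment is read with ape's read.FASTA (type = "AA") instead of going through seqinr and bioseq, gaps are encoded as "-", the input is an AAbin list (not a matrix), and strip_gaps is a hypothetical helper written for this example, not a function from ape or kmer. It also drops sequences that end up shorter than k, which is the condition the error message complains about.

```r
# Hypothetical helper (not part of ape or kmer): remove gap characters from an
# AAbin list and drop sequences that are left with fewer than k residues.
strip_gaps <- function(x, k, gap = "-") {
  gap_raw <- charToRaw(gap)
  # An AAbin list stores each sequence as a raw vector of ASCII codes
  seqs <- lapply(unclass(x), function(s) s[s != gap_raw])
  # Keep only sequences long enough to contain at least one k-mer
  seqs <- seqs[lengths(seqs) >= k]
  class(seqs) <- "AAbin"
  seqs
}

fasta_path <- "CadCfilalign.fasta"  # file name taken from the question
k <- 4

if (file.exists(fasta_path)) {
  aln <- ape::read.FASTA(fasta_path, type = "AA")  # read directly as an AAbin list
  raw_seqs <- strip_gaps(aln, k = k)               # unaligned residues only
  tree <- kmer::cluster(raw_seqs, k = k)           # returns a dendrogram
  plot(tree)
}
```

If the length filter removes many sequences, that may be a hint that a smaller k is more appropriate for this data set.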

2.6 years ago by 5heikki 11k
