How to use a protein alignment FASTA file with the cluster function of the kmer package
8 weeks ago

Hello everyone, I'm trying to use the cluster function of the kmer package to obtain a dendrogram for a large set of protein sequences that are already aligned (FASTA). The cluster function requires its input in AAbin format. I used as_AAbin, but the resulting value does not seem to be what cluster expects. With the example data "woodmouse", which appears to be an array, it works; but in my case "cadcbin" is not an array. Could you help?

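library(seqinr)

# Read the aligned protein sequences
cadcalignfil <- read.fasta("CadCfilalign.fasta", seqtype = "AA")

library(bioseq)

# Convert to a bioseq amino-acid vector, then to AAbin
cadcv <- aa(cadcalignfil)

cadcbin <- as_AAbin(cadcv)

library(ape)
library(kmer)  # provides cluster()

# Build the k-mer based tree
cluster(cadcbin, k = 4)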

Converting to Dayhoff(6) compressed alphabet for k > 3 Classes: AGPST, C, DENQ, FWY, HKR, ILMV

Error in kcount(x, k = k, residues = residues, gap = gap, named = FALSE) : minimum sequence length is less than k

Best regards, Pam

arrays kmer AAbin • 187 views

I'm not familiar with this package, but generally speaking, for k-mer based clustering you want to input raw sequences, not sequence alignments.
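A minimal sketch of that suggestion, in case it helps: strip the gap characters from the aligned sequences before clustering. It assumes the alignment uses only "-" and "." as gap symbols and that ape::read.FASTA(type = "AA") can read the file directly into a list-type AAbin object. The helper strip_gaps and the toy sequences are hypothetical, made up for illustration. Whether this removes the "minimum sequence length is less than k" error depends on what the original conversion actually produced, so it also drops any sequence shorter than k.

```r
# Gap symbols to drop; "-" and "." are assumed to be the only ones used.
gap_bytes <- charToRaw("-.")

# Hypothetical helper: remove gap bytes from each sequence in a list of raw
# vectors (the layout of a list-type "AAbin" object), keeping the names.
strip_gaps <- function(x) {
  out <- lapply(unclass(x), function(s) s[!(s %in% gap_bytes)])
  class(out) <- "AAbin"
  out
}

# Toy aligned sequences (made up), stored as raw bytes like an AAbin list.
toy <- structure(
  list(seqA = charToRaw("MK--TAYIAK"), seqB = charToRaw("MKV-TAY--K")),
  class = "AAbin"
)
toy_nogap <- strip_gaps(toy)
toy_lengths <- vapply(unclass(toy_nogap), length, integer(1))

# The same steps on the file from the question; this part runs only if the
# file exists and the ape and kmer packages are installed.
fasta_path <- "CadCfilalign.fasta"
if (file.exists(fasta_path) &&
    requireNamespace("ape", quietly = TRUE) &&
    requireNamespace("kmer", quietly = TRUE)) {
  k <- 4
  aln <- ape::read.FASTA(fasta_path, type = "AA")  # list-type AAbin
  seqs <- unclass(strip_gaps(aln))
  # Keep only sequences with at least k residues after gap removal.
  long_enough <- vapply(seqs, length, integer(1)) >= k
  seqs <- structure(seqs[long_enough], class = "AAbin")
  tree <- kmer::cluster(seqs, k = k)
  plot(tree)
}
```

Reading the file with ape instead of going through seqinr and bioseq keeps every sequence as a single raw vector, so its length can be compared with k before cluster is called.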
