9.2 years ago
safaa2comp
How can I convert amino acid sequences from a FASTA file into a matrix that I can use to classify the sequences? In some papers I have read, the authors use a PSI-BLAST profile as a 20*N vector, or as a 20*(2W+1) vector, where W is the number of neighbors on each side: the 2 stands for the left and right sides and the 1 for the residue itself. Does this mean that with W = 7 the vector represents the residue plus its 7 left neighbors and 7 right neighbors? How could I do this?
Thanks
Wow. This is a loaded question. First, why do you want to do this, and what do you expect to gain as a final result? Second, why not use PSI-BLAST directly if distantly related sequences are what you are looking for? Why does it need to be an "HD vector"? If you are looking for close neighbours, why not use a clustering tool like kclust or cd-hit? As I said, a loaded question that requires a bit more elaboration.
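For what it is worth, the 20*(2W+1) windowing described in the question can be sketched roughly as below. This is only a minimal illustration, not the method of any particular paper: the function names `one_hot_profile` and `window_features` are made up for this example, the one-hot matrix is a stand-in for a real PSI-BLAST profile (any N x 20 array would slot in the same way), and zero-padding at the sequence ends is an assumption about how terminal residues are handled.

```python
import numpy as np

# The 20 standard amino acids; this column order is an assumption for the sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot_profile(seq):
    """Return an N x 20 matrix with a 1 in the column of each residue.

    Stand-in for a PSI-BLAST profile; non-standard letters stay all-zero.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    profile = np.zeros((len(seq), 20))
    for pos, aa in enumerate(seq.upper()):
        if aa in index:
            profile[pos, index[aa]] = 1.0
    return profile


def window_features(profile, w):
    """Turn an N x 20 profile into an N x 20*(2w+1) feature matrix.

    Row i concatenates the profile rows of residues i-w .. i+w.
    Positions that fall outside the sequence are filled with zeros.
    """
    n, k = profile.shape
    pad = np.zeros((w, k))
    padded = np.vstack([pad, profile, pad])
    # padded[i : i + 2w + 1] is the window centred on original residue i.
    return np.stack([padded[i:i + 2 * w + 1].ravel() for i in range(n)])


features = window_features(one_hot_profile("ACDEFGHIKL"), w=7)
print(features.shape)
```

With w = 7, every residue gets one row of 20 * 15 = 300 values: 7 left neighbours, the residue itself, and 7 right neighbours. Each row can then be fed to a per-residue classifier.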
cheers
mxs