Let's say I have a FASTA file containing a protein sequence:
>albumin
MKWVTFISLL FLFSSAYSRG ... ... ...
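I want to split the sequence into all possible windows of 8 consecutive amino acids, in one direction only (amino -> carboxyl) and without wrapping around from the end of the sequence back to the start (I don't know if "looping" is the right expression for that). So a wrapped 8-mer such as GMKWVTFI should not appear. The output I am after looks like this: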
>fasta.albumin1
MKWVTFIS
>fasta.albumin2
KWVTFISL
>fasta.albumin3
WVTFISLL
...
>fasta.albumin13
FSSAYSRG
I also want to do this for all known human protein sequences.
How would I do it? I need the result as one FASTA file or several FASTA files, and the IDs of the resulting 8-mer sequences need to be unique.
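To make the request concrete, here is a minimal Python sketch of the sliding-window split described above. It uses only the standard library. The function names (read_fasta, kmers, write_kmer_fasta) and the "<id>_<position>" naming scheme are assumptions made for illustration, not an established convention. The IDs come out unique only if the first word of each input header is itself unique, which I am assuming holds for the input file.

```python
import io


def read_fasta(handle):
    """Yield (record_id, sequence) pairs from a FASTA-formatted text handle."""
    record_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if record_id is not None:
                yield record_id, "".join(chunks)
            # Assumption: the first whitespace-delimited token is the record ID.
            fields = line[1:].split()
            record_id = fields[0] if fields else ""
            chunks = []
        elif record_id is not None:
            # Drop internal spaces such as "MKWVTFISLL FLFSSAYSRG".
            chunks.append("".join(line.split()))
    if record_id is not None:
        yield record_id, "".join(chunks)


def kmers(sequence, k=8):
    """Yield (1-based start, k-mer) for each window, N- to C-terminus, no wrap-around."""
    # Sequences shorter than k give an empty range, so they yield nothing.
    for start in range(len(sequence) - k + 1):
        yield start + 1, sequence[start:start + k]


def write_kmer_fasta(in_handle, out_handle, k=8):
    """Write every k-mer of every record as its own FASTA entry."""
    for record_id, sequence in read_fasta(in_handle):
        for position, kmer in kmers(sequence, k):
            # Hypothetical ID scheme: original ID plus the 1-based start position.
            out_handle.write(f">{record_id}_{position}\n{kmer}\n")


# Small in-memory demo using the 20 residues shown in the question.
demo = io.StringIO(">albumin\nMKWVTFISLL FLFSSAYSRG\n")
out = io.StringIO()
write_kmer_fasta(demo, out)
print(out.getvalue())
```

For a real run, the two StringIO objects would be replaced by files opened with open(...), for example a downloaded human proteome FASTA as input and a new file as output (both file names would be up to you). Because the records are streamed one at a time, the whole proteome never has to sit in memory at once.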