I want to slice the sequences of a FASTA file. I take the first three sequences (I must calculate the length of each one) and divide each sequence into sub-sequences of equal length. For example, with the three sequences below:
the length of the first is 28, the second is 39, and the third is 46.
I split each sequence into sub-sequences of length 9. For the first, 28 / 9 = 3 with a remainder of 1, so the last sub-sequence contains a single base, 'G'; in this case I must pad it with the character '-'.
For the second, 39 / 9 = 4, and for the third, 46 / 9 = 5; any remainder is handled the same way as for the first sequence.
>gi|2765658|emb|Z78533.1|CIZ78533 C.irapeanum 5.8S rRNA gene and ITS1 and ITS2 DNA
CGTAACAAGGTTTCCGTAGGTGAACCTG
>gi|2765657|emb|Z78532.1|CCZ78532 C.californicum 5.8S rRNA gene and ITS1 and ITS2 DNA
CGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATC
>gi|2765646|emb|Z78521.1|CCZ78521 C.calceolus 5.8S rRNA gene and ITS1 and ITS2 DNA
GTAGGTGAACCTGCGGAAGGATCATTGTTGAGACAGTAGAATATAT
Then I take the first three sub-sequences from each sequence:
CGTAACAAG GTTTCCGTA GGTGAACCT
CGTAACAAG GTTTCCGTA GGTGAACCT
GTAGGTGAA CCTGCGGAA GGATCATTG
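The splitting step described above could be sketched roughly as follows. This is only a minimal illustration: the name `split_padded` is hypothetical, and it assumes that a short last piece is padded with '-' up to the full length of 9, which is one reading of the padding rule.

```python
def split_padded(seq, size=9, pad="-"):
    """Cut seq into pieces of `size` characters.

    Assumption: a short last piece is right-padded with `pad`
    until it is `size` characters long.
    """
    chunks = [seq[i:i + size] for i in range(0, len(seq), size)]
    if chunks and len(chunks[-1]) < size:
        chunks[-1] = chunks[-1].ljust(size, pad)
    return chunks


first = "CGTAACAAGGTTTCCGTAGGTGAACCTG"  # first sequence, length 28
chunks = split_padded(first)
print(chunks)      # ['CGTAACAAG', 'GTTTCCGTA', 'GGTGAACCT', 'G--------']
print(chunks[:3])  # ['CGTAACAAG', 'GTTTCCGTA', 'GGTGAACCT']
```

Slicing with `chunks[:3]` keeps the first three sub-sequences and also works when a sequence yields fewer than three pieces.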
Then I apply some function to each sub-sequence:
function1('CGTAACAAG'), function1('GTTTCCGTA'), ...
The same goes for function2.
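Applying the functions to every sub-sequence might look something like the sketch below. The bodies of `function1` and `function2` are hypothetical stand-ins (a G/C count and a string reversal), since the real functions are not specified here; only the looping pattern is the point.

```python
def function1(sub):
    # Hypothetical stand-in: count the G and C bases in the sub-sequence.
    return sub.count("G") + sub.count("C")


def function2(sub):
    # Hypothetical stand-in: return the sub-sequence reversed.
    return sub[::-1]


# The three sub-sequences taken from each of the three sequences.
subgroups = [
    ["CGTAACAAG", "GTTTCCGTA", "GGTGAACCT"],
    ["CGTAACAAG", "GTTTCCGTA", "GGTGAACCT"],
    ["GTAGGTGAA", "CCTGCGGAA", "GGATCATTG"],
]

# One result per sub-sequence, kept in the same nested layout.
results1 = [[function1(sub) for sub in group] for group in subgroups]
results2 = [[function2(sub) for sub in group] for group in subgroups]

print(results1)
print(results2)
```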
I want to apply this to all the sequences in the FASTA file, which means taking three sequences at a time.
How can I do this?
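One possible way to walk through the whole file three records at a time is sketched below. Everything named here (`read_fasta`, `batches`, `first_chunks`) is a hypothetical helper, the headers are shortened to `seq1`..`seq3`, and `seq4` is an invented extra record added only to show what happens to a leftover group. The sketch reads from an in-memory string; for a real file, `open("your_file.fasta")` would take the place of `demo`. If Biopython is installed, `Bio.SeqIO.parse(handle, "fasta")` could probably replace the hand-written reader.

```python
import io
from itertools import islice


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA handle."""
    header, parts = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header, parts = line[1:], []
        else:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def batches(iterable, n=3):
    """Yield lists of up to n consecutive items; the last list may be shorter."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, n))
        if not batch:
            return
        yield batch


def first_chunks(seq, size=9, count=3, pad="-"):
    """Return the first `count` pieces of `size` characters.

    Assumption: a short piece is right-padded with `pad` to `size`.
    """
    pieces = [seq[i:i + size].ljust(size, pad) for i in range(0, len(seq), size)]
    return pieces[:count]


demo = io.StringIO(
    ">seq1\nCGTAACAAGGTTTCCGTAGGTGAACCTG\n"
    ">seq2\nCGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATC\n"
    ">seq3\nGTAGGTGAACCTGCGGAAGGATCATTGTTGAGACAGTAGAATATAT\n"
    ">seq4\nACGTACGTACGT\n"  # invented record, not from the original data
)

groups = []
for batch in batches(read_fasta(demo), 3):
    # Each batch holds up to three (header, sequence) pairs.
    groups.append([first_chunks(seq) for _header, seq in batch])

for group in groups:
    print(group)
```

Each entry of `groups` then holds the sub-sequences for one set of three records, ready to be passed to `function1` and `function2`. A final group with fewer than three records is still yielded, so it has to be handled or skipped explicitly.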