I am not familiar with Perl (I mostly work with R) and I was hoping someone could help me with this problem. I have the test.fasta file below, and I need to build a table from it with one row per sequence length, running from the minimum sequence length to the maximum. For each length I want the number of sequences of that length, along with how many of those sequences start with each base (the first base at the 5' end). For example, in the FASTA below, seq1, seq5 and seq6 are 12 bases long, so the result table has length = 12, number of sequences = 3, sequences starting with A = 2, sequences starting with T = 1, and so forth. How do I get this done in Perl? Thank you for your help.
test.fasta:

    >seq1
    AATTGGTTTGTT
    >seq2
    AATTGTGGGTGGTTGT
    >seq3
    TGGTTTGGGTGGTAA
    >seq4
    TTGGGGTAAAAAAATTTAA
    >seq5
    TATTGGTTTGTT
    >seq6
    AAGTGGTTTGTT
Result I want (shown only for the sequences of length 12 bases):
    length  number  A  T  G  C
    12      3       2  1  0  0
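To make the tallying described above concrete, here is a minimal sketch of the logic. It is written in Python rather than Perl, purely as an illustration, and it rests on a few assumptions that the post does not spell out: every length between the minimum and the maximum gets a row (with zeros where no sequence has that length), the "number" column counts all sequences of that length even if they start with something other than A, T, G or C, and a record's sequence may span several lines. The function names `read_fasta` and `first_base_table` are made up for this example.

```python
from collections import Counter, defaultdict


def read_fasta(lines):
    """Yield one sequence string per FASTA record, joining multi-line records."""
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A header starts a new record: emit the previous one, if any.
            if chunks:
                yield "".join(chunks)
            chunks = []
        else:
            chunks.append(line)
    if chunks:
        yield "".join(chunks)


def first_base_table(seqs, bases="ATGC"):
    """Return rows of (length, number, count per base in `bases`)."""
    counts = defaultdict(Counter)  # length -> Counter of first bases
    for seq in seqs:
        counts[len(seq)][seq[0].upper()] += 1
    if not counts:
        return []
    rows = []
    # Assumption: report every length from min to max, including empty ones.
    for length in range(min(counts), max(counts) + 1):
        c = counts.get(length, Counter())
        # "number" is the total for this length, whatever the first base is.
        rows.append((length, sum(c.values())) + tuple(c[b] for b in bases))
    return rows


example = """>seq1
AATTGGTTTGTT
>seq2
AATTGTGGGTGGTTGT
>seq3
TGGTTTGGGTGGTAA
>seq4
TTGGGGTAAAAAAATTTAA
>seq5
TATTGGTTTGTT
>seq6
AAGTGGTTTGTT
"""

rows = first_base_table(read_fasta(example.splitlines()))
print("length\tnumber\tA\tT\tG\tC")
for row in rows:
    print("\t".join(str(value) for value in row))
```

For the six sequences in test.fasta, the first row printed is the length 12 row from the table above (12, 3, 2, 1, 0, 0). To run it on the file itself, the same functions should work on an open file handle, for example `first_base_table(read_fasta(open("test.fasta")))`. The same approach (a hash keyed by length, holding per-base counters) carries over to Perl directly.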