11.1 years ago
2011101101
I have three or more small RNA samples that were sequenced by Solexa high-throughput sequencing. The files are in FASTA format, and the reads are 18-28 nt long. I want to count how many times each identical sequence occurs in each of the samples. For example, the files are named a.fa, b.fa, and c.fa. There is a similar question, "Statistics The Number Of Identical Sequence Between Different Samples", but I don't know how to modify its answer for my case. I hope you can help.
a.fa
>1_x2
ATCG
>2_x3
ACTG
>3_x1
GAAG
b.fa
>a_x5
GAAG
>b_x3
ATCG
c.fa
>c_x1
ACTG
>2_x2
AAG
The result I want looks like this:
sequence a b c
ATCG 2 3 0
ACTG 3 0 1
GAAG 1 5 0
AAG 0 0 2
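
A minimal Python sketch of the tabulation described above. It assumes that the `_xN` suffix of each FASTA header is the read count for that sequence (which is what the example table implies), that a header without such a suffix counts as 1, and that the column name is the file name without its extension. The function names `read_counts` and `build_table` are made up for illustration.

```python
import os
import re
import sys
from collections import defaultdict

# Assumption: headers end in "_x<count>", e.g. ">1_x2" means the read was seen 2 times.
COUNT_RE = re.compile(r"_x(\d+)$")


def read_counts(lines):
    """Return {sequence: summed count} for one FASTA sample given as an iterable of lines."""
    counts = defaultdict(int)
    header, seq = None, []
    # The trailing ">" is a sentinel so the last record is flushed like the others.
    for line in list(lines) + [">"]:
        line = line.strip()
        if line.startswith(">"):
            if header is not None and seq:
                match = COUNT_RE.search(header)
                # Headers without a "_xN" suffix are counted once (assumption).
                counts["".join(seq)] += int(match.group(1)) if match else 1
            header, seq = line[1:].strip(), []
        elif line:
            seq.append(line)
    return counts


def build_table(samples):
    """samples: list of (name, counts) pairs. Returns a header row plus one row per sequence."""
    order, seen = [], set()
    for _, counts in samples:
        for sequence in counts:
            if sequence not in seen:
                seen.add(sequence)
                order.append(sequence)  # keep first-seen order across samples
    rows = [["sequence"] + [name for name, _ in samples]]
    for sequence in order:
        rows.append([sequence] + [str(counts.get(sequence, 0)) for _, counts in samples])
    return rows


if __name__ == "__main__" and len(sys.argv) > 1:
    samples = []
    for path in sys.argv[1:]:
        name = os.path.splitext(os.path.basename(path))[0]
        with open(path) as handle:
            samples.append((name, read_counts(handle)))
    for row in build_table(samples):
        print("\t".join(row))
```

Saved under a hypothetical name such as `count_table.py`, it could be run as `python count_table.py a.fa b.fa c.fa`; any number of sample files can be given, and the output is tab-separated with one column per file. Sequences are compared as exact strings, so AAG and GAAG are treated as different sequences.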