I am trying to calculate the per-sequence GC3% (GC content at third codon positions) of the CDS I obtained from my RNA-seq assemblies. I thought it would be a quick, trivial task, but it is not as straightforward as I expected!
One way to do this would be to extract the third codon position from every sequence in my CDS multi-FASTA files and then calculate GC% for each sequence. I tried to extract the 3rd codon positions from my CDS .fasta file using the one-liner below, but it takes ages!
while read -r line; do if echo "$line" | grep -q "^>"; then echo "$line"; else echo "$line" | sed -E 's/..(.)/\1/g'; fi; done < cds.fasta > 3rd_position.fasta
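
For comparison, the same idea can be done in a single awk pass, which avoids spawning grep and sed for every line (the likely reason the loop is so slow). This is only a sketch under some assumptions: the function name gc3_per_seq is made up for illustration, every CDS is assumed to start in frame (its first base is codon position 1), sequences may be wrapped over several lines, and any trailing partial codon is ignored.

```shell
# Sketch: per-sequence GC3% in one awk pass (hypothetical helper name).
# Assumes each CDS starts in frame; trailing partial codons are ignored.
gc3_per_seq() {
  awk '
    # Print "id <TAB> GC3%" for the sequence collected so far.
    function report(   n, i, gc, b) {
      if (id == "") return
      n = 0; gc = 0
      # Visit every 3rd base of the joined sequence (positions 3, 6, 9, ...).
      for (i = 3; i <= length(seq); i += 3) {
        b = toupper(substr(seq, i, 1))
        if (b == "G" || b == "C") gc++
        n++
      }
      printf "%s\t%.2f\n", id, (n ? 100 * gc / n : 0)
    }
    { sub(/\r$/, "") }                                   # tolerate Windows line endings
    /^>/ { report(); id = substr($1, 2); seq = ""; next } # header: flush previous record
    { seq = seq $0 }                                     # sequence line: append
    END { report() }                                     # flush the last record
  ' "$@"
}
```

Called as gc3_per_seq cds.fasta, it prints one tab-separated line per sequence: the sequence ID (the header up to the first whitespace) and its GC3% to two decimals.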