23 months ago by
The phrase "ACGT content per cycle" comes from Sanger and Illumina sequencing terminology. A "cycle" in this context corresponds to a base position: the first cycle is the first base of every read in the alignments, the second cycle is the second base, and so on. In some experiments (notably whole-genome sequencing) one expects the proportions of A, C, G and T to stay roughly constant across cycles. In many other types of experiments (e.g., RNA-seq or amplicon sequencing) this is not the case. Either way, the graph output by FastQC is probably more useful than what samtools stats is giving you.
Yes, the GCC lines have the per-cycle ACGT content, which is why they're preceded by:
# ACGT content per cycle. Use `grep ^GCC | cut -f 2-` to extract this part. The columns are: cycle; A,C,G,T base counts as a percentage of all A/C/G/T bases [%]; and N and O counts as a percentage of all A/C/G/T bases [%]
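If you would rather inspect those numbers in a script than with `grep`/`cut`, here is a minimal Python sketch. It assumes the `samtools stats` output is tab-separated and that each GCC line holds the cycle followed by the A, C, G, T, N and O percentages, as the header comment describes. The function name `parse_gcc` and the example values are made up for illustration, not real output.

```python
def parse_gcc(stats_text):
    """Collect the per-cycle base percentages from GCC lines.

    Assumes tab-separated `samtools stats` output in which each GCC line is:
    GCC, cycle, then percentages for A, C, G, T, N, O (as in the header comment).
    """
    rows = []
    for line in stats_text.splitlines():
        # Same filter as `grep ^GCC`; comment lines start with '#', so they are skipped.
        if not line.startswith("GCC\t"):
            continue
        fields = line.split("\t")
        row = {"cycle": int(fields[1])}
        # zip() stops at the shorter input, so a line with fewer columns
        # simply yields fewer keys instead of raising an error.
        row.update(zip(("A", "C", "G", "T", "N", "O"), map(float, fields[2:])))
        rows.append(row)
    return rows


# Made-up example input, not real data.
example = (
    "# ACGT content per cycle.\n"
    "GCC\t1\t30.00\t20.00\t20.00\t30.00\t0.00\t0.00\n"
    "GCC\t2\t25.00\t25.00\t25.00\t25.00\t0.50\t0.00\n"
)

rows = parse_gcc(example)
for row in rows:
    print(row["cycle"], row["A"], row["C"], row["G"], row["T"])
```

For a real file, read it first (e.g. `open("stats.txt").read()`, where `stats.txt` is a placeholder name) and pass the text to `parse_gcc`. A cycle whose A/C/G/T values differ sharply from its neighbours is the kind of thing the FastQC plot would show visually.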