Question: The problem using samtools stats
23 months ago, 934963534 wrote:

The output contains a section called "ACGT content per cycle". What does "cycle" mean here?

I also see lines that start with GCC and give the percentages of A, C, G, and T. Is this the distribution of A, C, G, and T at each base position of the sequences?

Thank you for your answers!

sequencing stats samtools • 476 views

What command are you running? Also, please share its output.

written 23 months ago by bioExplorer
23 months ago, Devon Ryan (Freiburg, Germany) wrote:

The phrase "ACGT content per cycle" comes from Sanger and Illumina sequencing. A "cycle" in this context is a base position, so the first cycle is the first base in all alignments, the second cycle is the second base, and so on. In some experiments (namely whole genome sequencing) one expects the proportions of A, C, G, and T to be roughly constant across cycles. In many other types of experiments (e.g., RNA-seq or amplicon sequencing), this is not the case. Either way, the graph produced by FastQC is probably more useful than what samtools stats gives you.
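
To make the idea of a "cycle" concrete, here is a minimal sketch that computes the A/C/G/T percentage at each position of a handful of sequences. The function name `acgt_per_cycle` and the toy reads are made up for illustration; this is not how samtools computes its numbers internally, and it ignores details such as strand and N bases.

```python
from collections import Counter

def acgt_per_cycle(reads):
    """Return one dict per cycle (1st base, 2nd base, ...) with A/C/G/T percentages."""
    n_cycles = max((len(r) for r in reads), default=0)
    result = []
    for i in range(n_cycles):
        # Count the bases seen at position i across all reads long enough to have one.
        counts = Counter(r[i] for r in reads if len(r) > i)
        total = sum(counts[b] for b in "ACGT")
        result.append(
            {b: (100.0 * counts[b] / total if total else 0.0) for b in "ACGT"}
        )
    return result

# Hypothetical toy reads, not real data.
toy_reads = ["ACGT", "AGGT", "ACTT", "TCGA"]
for cycle, pct in enumerate(acgt_per_cycle(toy_reads), start=1):
    print(cycle, pct)
```

Each printed row corresponds to one cycle, which is the same layout as the per-cycle rows reported by samtools stats.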

Yes, the GCC lines have the per-cycle ACGT content, which is why they're preceded by:

# ACGT content per cycle. Use `grep ^GCC | cut -f 2-` to extract this part. The columns are: cycle; A,C,G,T base counts as a percentage of all A/C/G/T bases [%]; and N and O counts as a percentage of all A/C/G/T bases [%]
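
If you would rather work with these lines in a script than with `grep` and `cut`, something like the following sketch could parse them. It assumes the tab-separated column order given in the header above (cycle, then A, C, G, T percentages); the function name `parse_gcc` and the sample text are hypothetical, and the numbers in the sample are invented.

```python
def parse_gcc(stats_text):
    """Extract per-cycle A/C/G/T percentages from samtools stats output text."""
    rows = []
    for line in stats_text.splitlines():
        # Only the GCC section holds the per-cycle ACGT content.
        if not line.startswith("GCC\t"):
            continue
        fields = line.split("\t")
        cycle = int(fields[1])
        a, c, g, t = (float(x) for x in fields[2:6])
        rows.append({"cycle": cycle, "A": a, "C": c, "G": g, "T": t})
    return rows

# Made-up sample resembling the GCC section (values are not real).
sample = (
    "# ACGT content per cycle.\n"
    "GCC\t1\t25.50\t24.50\t26.00\t24.00\t0.00\t0.00\n"
    "GCC\t2\t30.00\t20.00\t20.00\t30.00\t0.00\t0.00\n"
    "SN\tsequences:\t100\n"
)
for row in parse_gcc(sample):
    print(row)
```

In practice you would read the text from the file you saved the samtools stats output to and pass it to the function.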

Thanks, I have got it.

written 23 months ago by 934963534