Question: The problem using samtools stats
93496353420 wrote:

The samtools stats output includes a section called "ACGT content per cycle". What does "cycle" mean here?

I also see lines that start with GCC and give the percentages of A, C, G, and T. Is this the distribution of A, C, G, and T at each base position of the sequences?

Thank you for your answers!

sequencing stats samtools • 553 views
modified 2.3 years ago by Devon Ryan91k • written 2.3 years ago by 93496353420

What command are you running? Also, please share its output.

written 2.3 years ago by lakhujanivijay4.3k
Devon Ryan91k wrote:

The term "ACGT content per cycle" comes from Sanger and Illumina sequencing. A "cycle" in this context is a base position within a read: the first cycle is the first base of every alignment, the second cycle is the second base, and so on. In some experiments (namely whole-genome sequencing) one expects the proportions of A, C, G, and T to be roughly constant across cycles. In many other types of experiments (e.g., RNA-seq or amplicon sequencing) this is not the case. Either way, the graph produced by FastQC is probably more useful than what samtools stats gives you.
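
To make "per cycle" concrete, here is a minimal Python sketch that computes the percentage of each base at every position for a handful of made-up reads. The function name and the toy reads are only for illustration, and this is not how samtools itself does the counting (it says nothing about strand handling or trimming, for example).

```python
from collections import Counter


def per_cycle_percentages(reads):
    """For each cycle (1-based base position), return the % of A, C, G and T.

    Percentages are relative to the number of A/C/G/T bases seen at that
    position, so reads shorter than the current cycle are simply skipped.
    """
    longest = max((len(r) for r in reads), default=0)
    result = []
    for i in range(longest):
        counts = Counter(r[i] for r in reads if len(r) > i)
        acgt_total = sum(counts[b] for b in "ACGT")
        result.append(
            {b: (100.0 * counts[b] / acgt_total if acgt_total else 0.0)
             for b in "ACGT"}
        )
    return result


# Toy reads, invented for this example.
reads = ["ACGT", "AGGT", "ACTT", "TCGA"]
for cycle, pct in enumerate(per_cycle_percentages(reads), start=1):
    print(cycle, pct)
```

With these four reads, cycle 1 is 75% A and 25% T, because three reads start with A and one starts with T.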

Yes, the GCC lines have the per-cycle ACGT content, which is why they're preceded by:

# ACGT content per cycle. Use `grep ^GCC | cut -f 2-` to extract this part. The columns are: cycle; A,C,G,T base counts as a percentage of all A/C/G/T bases [%]; and N and O counts as a percentage of all A/C/G/T bases [%]
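
If you would rather work with these numbers in a script than with grep and cut, something like the following Python sketch might do. It assumes the tab-separated layout described in the header line above (GCC, cycle, then the A, C, G, T percentages, followed by N and O where present). The function name and the example lines are made up for illustration.

```python
from typing import Dict, List


def parse_gcc(stats_text: str) -> List[Dict[str, float]]:
    """Collect the GCC (ACGT content per cycle) rows from samtools stats text."""
    rows = []
    for line in stats_text.splitlines():
        fields = line.split("\t")
        if fields[0] != "GCC":
            continue  # skip comments and all other sections
        row = {"cycle": int(fields[1])}
        # Columns after the cycle: A, C, G, T, then N and O if they are there.
        for name, value in zip("ACGTNO", fields[2:]):
            row[name] = float(value)
        rows.append(row)
    return rows


# Invented example lines, not real data.
example = (
    "# ACGT content per cycle (header shortened for this example)\n"
    "GCC\t1\t30.00\t20.00\t20.00\t30.00\t0.00\t0.00\n"
    "GCC\t2\t25.00\t25.00\t25.00\t25.00\t0.00\t0.00\n"
)

for row in parse_gcc(example):
    print(row["cycle"], row["A"], row["C"], row["G"], row["T"])
```

In practice you would save the output first (e.g. `samtools stats in.bam > stats.txt`, where in.bam is a placeholder file name) and pass the contents of that file to the function.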
written 2.3 years ago by Devon Ryan91k

Thanks, I have got it.

written 2.3 years ago by 93496353420