Clean reads fail "Per base sequence content" in FastQC
2.1 years ago
Xudong Li ▴ 10

I have sequenced the whole genomes of dozens of bacteria and run a quality check on the clean reads with FastQC. Every sample fails "Per Base Sequence Content" because of GC/AT separation at positions 1-10 bp, but every sample passes all the other tests.

Below is a typical fastqc report for my data.

I have read a lot about this, and many people say it can safely be ignored. But I am still confused about what causes this phenomenon.

[Image: typical FastQC report]

quality NGS check reads fastqc • 1.2k views
2.1 years ago

I will just copy-paste from the FastQC help page, because it answers your question nicely:

[...] some types of library will always produce biased sequence composition, normally at the start of the read. Libraries produced by priming using random hexamers (including nearly all RNA-Seq libraries) and those which were fragmented using transposases inherit an intrinsic bias in the positions at which reads start. This bias does not concern an absolute sequence, but instead provides enrichment of a number of different K-mers at the 5' end of the reads. Whilst this is a true technical bias, it isn't something which can be corrected by trimming and in most cases doesn't seem to adversely affect the downstream analysis. It will however produce a warning or error in this module.

In other words, "random hexamer priming" during library prep is not perfectly random, but the resulting bias is usually harmless for downstream analysis.
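
If you want to look at the bias yourself rather than rely on the plot, here is a minimal sketch in Python that tallies the base composition at the first few read positions and lists the positions where A vs T or G vs C differ by more than a threshold. The function names (`per_base_content`, `flag_positions`, `read_fastq_sequences`) are my own, not part of FastQC, and the 20% default is based on my reading of the FastQC documentation for this module's failure rule (10% for a warning), so treat it as an assumption. The reader also assumes an uncompressed FASTQ file with exactly four lines per record.

```python
from collections import Counter


def read_fastq_sequences(path):
    """Yield sequence lines from an uncompressed, 4-lines-per-record FASTQ file."""
    with open(path) as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 == 1:  # the second line of each record is the sequence
                yield line.strip()


def per_base_content(reads, max_pos=15):
    """Return one {base: fraction} dict per read position (0-based list index)."""
    counts = [Counter() for _ in range(max_pos)]
    for seq in reads:
        for i, base in enumerate(seq[:max_pos].upper()):
            counts[i][base] += 1
    fractions = []
    for c in counts:
        total = sum(c[b] for b in "ACGT")  # N and other symbols are ignored
        fractions.append({b: (c[b] / total if total else 0.0) for b in "ACGT"})
    return fractions


def flag_positions(fractions, threshold=0.20):
    """Return 1-based positions where |A-T| or |G-C| exceeds the threshold.

    The 0.20 default is an assumed stand-in for the FastQC failure rule.
    """
    flagged = []
    for pos, f in enumerate(fractions, start=1):
        if abs(f["A"] - f["T"]) > threshold or abs(f["G"] - f["C"]) > threshold:
            flagged.append(pos)
    return flagged
```

For a real file you would call something like `flag_positions(per_base_content(read_fastq_sequences("sample_R1.fastq")))`, where `sample_R1.fastq` is a placeholder file name. If only the first ~10 positions are flagged and the rest of the read is flat, that matches the start-of-read bias described in the quote above.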


I think you are right! Thank you very much!
