Question: Very confused about GC bias
0
9521ljh10 wrote:

I have FASTQ files, and the "Per sequence GC content" plot from FastQC is not well shaped, so I suspect the sample is contaminated.

[Image: FastQC "Per sequence GC content" plot]

But does this failed GC content check mean there is GC bias?

I thought GC bias is related to coverage and read depth, which is a problem seen only after mapping,

but the picture above comes from the unmapped FASTQ file.

Am I right in thinking that GC content is different from GC bias?

fastqc • 207 views
modified 13 days ago by chen1.9k • written 19 days ago by 9521ljh10
1

Hi, your result seems fine; your distribution is close to the theoretical one :)

written 19 days ago by Titus850

Is this from raw sample data, or did you process it already?

written 19 days ago by lieven.sterck4.8k

It is a raw FASTQ file.

written 19 days ago by 9521ljh10
5
18 days ago by
Friederike4.2k
United States
Friederike4.2k wrote:

But does this failed GC content check mean there is GC bias?

What you see is that you have more reads with a GC content greater than 50% than FastQC would expect given a normal distribution based on the mode of your reads' GC content. This may be indicative of GC bias, but it doesn't have to be, and it matters less if you're not too interested in quantitative measures down the road. Keep calm and carry on; just keep this in the back of your mind before drawing strong conclusions, e.g. about interesting enrichments seen for regions with 50-60% GC content.
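To make the comparison concrete, here is a rough sketch of the idea: take the per-read GC percentages, center a normal curve on their mode, and sum up how far the observed histogram strays from that curve. This is only an illustration of the principle, not FastQC's actual implementation; the function name `gc_deviation_percent` and the way the spread is estimated are my own assumptions.

```python
import math
from collections import Counter

def gc_deviation_percent(gc_percents):
    """Summed |observed - expected| over 1% GC bins, as a percentage of all reads.

    'expected' is a normal curve centered on the mode of the observed values
    (an illustrative approximation, not FastQC's exact model).
    """
    counts = Counter(int(round(g)) for g in gc_percents)
    total = sum(counts.values())
    mode = max(counts, key=counts.get)

    # Spread of the observed values around the mode.
    variance = sum(n * (g - mode) ** 2 for g, n in counts.items()) / total
    sd = math.sqrt(variance) or 1.0  # avoid division by zero for constant input

    # Unnormalized normal curve over 0..100 % GC, then scaled to the read count.
    curve = [math.exp(-((g - mode) ** 2) / (2 * sd ** 2)) for g in range(101)]
    scale = total / sum(curve)

    deviation = sum(abs(counts.get(g, 0) - c * scale) for g, c in enumerate(curve))
    return 100.0 * deviation / total

# Toy data: a symmetric peak, and the same peak plus a second population at 70% GC.
symmetric = [48] * 10 + [49] * 20 + [50] * 40 + [51] * 20 + [52] * 10
with_shoulder = symmetric + [70] * 30

dev_symmetric = gc_deviation_percent(symmetric)
dev_shoulder = gc_deviation_percent(with_shoulder)
print(dev_symmetric < dev_shoulder)
```

The second data set scores worse because the extra population at 70% GC pulls the observed histogram away from a single bell-shaped curve, which is the kind of shape the module flags.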

I thought GC bias is related to coverage and read depth, which is a problem seen only after mapping

The GC content of each read can be determined irrespective of its location in the genome; after all, you only need to tally the types of bases you've sequenced, which is exactly the type of information that's stored in a fastq file.

But you are right insofar as FastQC's assumption about what a uniform sampling of your organism's genome should look like might be incorrect.
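The tallying described above needs nothing but the sequence lines of the FASTQ file. A minimal sketch, assuming an uncompressed file with exactly four lines per record (the function name `read_gc_percents` is made up for this example):

```python
import io

def read_gc_percents(handle):
    """Yield the GC percentage of each read in a 4-line-per-record FASTQ stream."""
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the second line of every record is the sequence
            seq = line.strip().upper()
            if seq:
                # N bases count toward the read length but not toward G/C.
                yield 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

# Tiny in-memory example; for a real file, pass open("reads.fastq") instead.
example = io.StringIO(
    "@r1\nGGCC\n+\nIIII\n"
    "@r2\nATAT\n+\nIIII\n"
    "@r3\nACGT\n+\nIIII\n"
)
gc_values = list(read_gc_percents(example))
print(gc_values)  # [100.0, 0.0, 50.0]
```

No reference genome or alignment is involved, which is why FastQC can draw this plot from raw reads.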

Am I right in thinking that GC content is different from GC bias?

GC content simply describes the number of G's and C's that you sequence in relation to the number of A's and T's. GC bias is typically used to describe the fact that the enzymes and conditions used for PCR amplification tend to amplify fragments with modest to medium-high GC content more efficiently. There will always be some sort of GC bias in Illumina-based sequencing (the reference by Yuval Benjamini and Terry Speed that Ranan pointed to is an enlightening read in that regard); it mostly becomes an issue if you are trying to compare the read numbers of different samples where one sample (type) had only mild GC bias while the other one shows dramatic GC bias.
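One way to see the difference: GC content is a property of each read, whereas GC bias shows up as a dependence of read counts on GC content once the reads are mapped. A hedged sketch, assuming you already have, for a set of genomic windows, the GC percentage of the reference sequence and the number of reads mapped there (the input format and the function name `relative_coverage_by_gc` are hypothetical):

```python
from collections import defaultdict

def relative_coverage_by_gc(windows, bin_width=10):
    """Mean read count per GC bin, divided by the mean over all windows.

    windows: iterable of (gc_percent, read_count) pairs, one per genomic window.
    Values near 1.0 in every bin suggest little GC bias; a strong trend
    across bins suggests coverage depends on GC content.
    """
    sums = defaultdict(float)
    sizes = defaultdict(int)
    for gc_percent, count in windows:
        # Put 100% GC into the last bin instead of opening a bin of its own.
        b = min(int(gc_percent // bin_width) * bin_width, 100 - bin_width)
        sums[b] += count
        sizes[b] += 1
    if not sizes:
        raise ValueError("no windows given")
    overall_mean = sum(sums.values()) / sum(sizes.values())
    if overall_mean == 0:
        raise ValueError("no reads in any window")
    return {b: (sums[b] / sizes[b]) / overall_mean for b in sorted(sums)}

# Toy input: windows around 55% GC collect twice as many reads as those around 35%.
toy_windows = [(35, 100), (38, 100), (55, 200), (58, 200)]
profile = relative_coverage_by_gc(toy_windows)
print(profile)
```

Comparing such profiles between samples is where trouble starts: if one sample is flat and the other is steep, differences in read counts may reflect the bias rather than biology.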

modified 18 days ago • written 18 days ago by Friederike4.2k
3
19 days ago by
lieven.sterck4.8k
VIB, Ghent, Belgium
lieven.sterck4.8k wrote:

This is nothing to worry about. It simply shows the GC content of your read data. I would not say it deviates severely from the expected curve. It could perhaps be due to the organism you work on. Moreover, FastQC is very strict in its evaluation.

Here is an interesting link about all this: QCfail

What I am a little surprised about is that you have all green checks in the overview; I've seen this only very rarely :/

modified 19 days ago • written 19 days ago by lieven.sterck4.8k
1
19 days ago by
Mizoram University
Ranan Jyoti Sarma30 wrote:

This may help you. https://academic.oup.com/nar/article/40/10/e72/2411059

written 19 days ago by Ranan Jyoti Sarma30
0
13 days ago by
chen1.9k
OpenGene
chen1.9k wrote:

You should take a look at the GC content curves.

The fastp tool may help, see: https://github.com/OpenGene/fastp

written 13 days ago by chen1.9k

Some sequencers, such as the Illumina NovaSeq, may produce polyG stretches at the ends of reads, which can affect the GC curve. Use fastp to trim polyG tails and check the post-filtering data.
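To illustrate what trimming a polyG tail means, here is a deliberately simplified sketch. It is not how fastp does it (fastp has its own detection logic); the function name `trim_trailing_poly_g` and the `min_run=10` threshold are assumptions chosen for the example.

```python
def trim_trailing_poly_g(seq, qual, min_run=10):
    """Remove a run of G's at the 3' end if it is at least min_run bases long.

    Returns the (possibly shortened) sequence and its matching quality string.
    Exact-match only: a single non-G base inside the tail stops the trimming.
    """
    stripped = seq.rstrip("G")
    run_length = len(seq) - len(stripped)
    if run_length >= min_run:
        return stripped, qual[:len(stripped)]
    return seq, qual

# A read ending in 12 G's is trimmed; a read ending in 2 G's is left alone.
trimmed = trim_trailing_poly_g("ACGT" + "G" * 12, "I" * 16)
untouched = trim_trailing_poly_g("ACGTGG", "IIIIII")
print(trimmed, untouched)
```

Reads that keep such tails look far more GC-rich than the underlying fragment, so recomputing the GC curve after trimming shows whether the tails were responsible for the odd shape.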

written 13 days ago by chen1.9k