10 months ago by
But is this Failure of GC content means GC bias?
What you see is that you have more reads with a GC content of greater than 50% than what FastQC would expect given a normal distribution based on the mode of your reads' GC content.
This may be indicative of GC bias, but it doesn't have to be, and it matters less if you're not too interested in quantitative measures down the road.
Keep calm and carry on and just keep this in the back of your mind before drawing strong conclusions, e.g. about interesting enrichments seen for regions with 50-60% GC content.
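To make the comparison described above a bit more concrete, here is a minimal Python sketch. It is not FastQC's actual code: the function name gc_deviation_from_normal is made up for illustration, and estimating the spread of the curve around the mode is my own simplifying assumption. It fits a normal curve centred on the mode of the per-read GC percentages and reports what fraction of reads deviates from that curve.

```python
import math
from collections import Counter

def gc_deviation_from_normal(gc_percents):
    """Fraction of reads deviating from a normal curve centred on the mode.

    gc_percents: one integer GC percentage (0-100) per read.
    Hypothetical helper for illustration, not FastQC's implementation.
    """
    n = len(gc_percents)
    if n == 0:
        return 0.0
    counts = Counter(gc_percents)
    mode = max(counts, key=counts.get)
    # Assumption: use the spread of the observed values around the mode.
    sd = math.sqrt(sum((g - mode) ** 2 for g in gc_percents) / n)
    if sd == 0:
        return 0.0  # degenerate case: every read has the same GC percentage
    # Expected read count per GC percentage under the fitted normal curve.
    weights = [math.exp(-((g - mode) ** 2) / (2 * sd ** 2)) for g in range(101)]
    total = sum(weights)
    expected = [n * w / total for w in weights]
    # Sum of absolute differences between observed and expected counts.
    deviation = sum(abs(counts.get(g, 0) - expected[g]) for g in range(101))
    return deviation / n
```

A library with a shoulder of extra reads above the mode gives a larger value than a symmetric one; how large is "too large" is a threshold the tool picks, not a biological fact.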
because i think GC bias is related with coverage and depth of read(after mapping problem)
The GC content of each read can be determined irrespective of its location in the genome; after all, you only need to tally the types of bases you've sequenced, which is exactly the type of information that's stored in a fastq file.
But you are right insofar as FastQC's assumption about what a uniform sampling of your organism's genome should look like might be incorrect.
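To illustrate that per-read GC content needs nothing but the fastq file itself, here is a small Python sketch. The function names are my own, and it assumes plain four-line fastq records and ignores N calls when computing the fraction.

```python
def gc_fraction(seq):
    """Fraction of G and C among the called bases (A, C, G, T) of one read."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0  # empty read or only N's
    return (seq.count("G") + seq.count("C")) / called

def read_gc_fractions(fastq_lines):
    """Yield the GC fraction of each read from an iterable of fastq lines.

    Assumes strict four-line records; the sequence is the second line.
    """
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            yield gc_fraction(line.strip())

# Two toy records, no mapping or reference genome involved.
example = ["@read1", "GGCCAT", "+", "IIIIII",
           "@read2", "ATATNN", "+", "IIIIII"]
fractions = list(read_gc_fractions(example))
```

For a real file you would pass an open file handle instead of the toy list, e.g. read_gc_fractions(open("reads.fastq")), where the file name is just a placeholder.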
am i right think that GC content is difference with GC bias?
GC content simply describes the proportion of G's and C's among all the bases that you sequence, i.e. relative to the A's and T's.
GC bias is typically used to describe the fact that the enzymes and conditions used for PCR amplification tend to more efficiently amplify fragments with modest to medium-high GC content. There will always be some sort of GC bias in Illumina-based sequencing (the reference by Yuval Benjamini and Terry Speed that Ranan pointed to is an enlightening read in that regard); it mostly becomes an issue if you are trying to compare the read numbers of different samples where one sample (type) had only mild GC bias while the other one shows dramatic GC bias.
modified 10 months ago
10 months ago by
Friederike ♦ 5.4k