Question: Why is GC% per base position important in quality control of reads?
c.clarido wrote:

Hello,

In quality control of reads, why do we look at the GC% per base position? I have the following result:

Mean length: 75
Max. length: 101
Min. length: 24
Overall GC: 32%
GC per base position: 
[32, 33, 33, 33, 33, 33, 33, 32, 33, 33, 33, 33, 33, 33, 33, 32, 32, 32, 33, 33, 32, 33, 33, 33, 33, 32, 32, 32, 32, 32, 31, 31, 32, 31, 31, 31, 31, 30, 30, 30, 30, 30, 30, 30, 30, 30, 29, 29, 28, 28, 28, 28, 28, 27, 27, 27, 27, 27, 27, 26, 26, 25, 25, 25, 25, 24, 24, 24, 23, 23, 22, 22, 21, 21, 21, 20, 19, 19, 18, 18, 17, 16, 15, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 1, 0, 0]

Looking at these values, I can see that the GC% decreases toward the end of the reads. What can I conclude from this? Thank you in advance!

Tags: qc, assembly
written 6 months ago by c.clarido • modified 6 months ago by gb

It looks like this is just a bias caused by the differences in read length. After position 24 the number of reads covering each position decreases, and the GC% decreases with it.
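To illustrate the point, here is a minimal Python sketch. It assumes (the original post does not show the script, so this is only a guess) that the GC count at each position was divided by the total number of reads instead of by the number of reads that actually reach that position. The function name `gc_per_position` and its parameter are made up for this example.

```python
def gc_per_position(reads, per_position_denominator=True):
    """Return the GC% at each position for reads of varying length.

    per_position_denominator=True  -> divide by reads covering that position
    per_position_denominator=False -> divide by the total number of reads
                                      (the suspected source of the bias)
    """
    max_len = max(len(read) for read in reads)
    gc_counts = [0] * max_len   # number of G/C bases seen at each position
    coverage = [0] * max_len    # number of reads long enough to reach each position

    for read in reads:
        for i, base in enumerate(read.upper()):
            coverage[i] += 1
            if base in "GC":
                gc_counts[i] += 1

    total_reads = len(reads)
    result = []
    for gc, cov in zip(gc_counts, coverage):
        denominator = cov if per_position_denominator else total_reads
        result.append(100.0 * gc / denominator)
    return result


# Toy reads made only of G and C, with different lengths.
reads = ["GC", "GCGC", "GCGCGCGC"]
corrected = gc_per_position(reads)                               # flat curve
biased = gc_per_position(reads, per_position_denominator=False)  # declining curve
```

With the toy reads the real GC content is the same at every position, yet the second variant drops toward the end simply because fewer reads are that long.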

Reply written 6 months ago by corend

I also believe there is a rule that you should mention when a question is about a school assignment. Maybe a moderator can confirm that.

Reply written 6 months ago by gb
gb wrote:

In your case it could be a poly-A tail or something similar. Did you trim off the primers/adapters and everything after them? Does it still look like this after quality trimming? Are all the reads the same length?

I think GC content is mostly used as a quality check by comparing it with the GC content of a reference. You expect a given species or chromosome to have a certain characteristic GC%. If you expect 40% on chromosome x and you get 75%, something is off.
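A rough Python sketch of that comparison could look like the following. The function names and the tolerance of 5 percentage points are arbitrary choices for illustration, not an established cutoff.

```python
def gc_percent(sequence):
    """Return the GC% of a sequence, ignoring anything that is not A, C, G or T."""
    sequence = sequence.upper()
    gc = sequence.count("G") + sequence.count("C")
    acgt = gc + sequence.count("A") + sequence.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


def gc_deviates(observed, expected, tolerance=5.0):
    """Flag a sample whose GC% differs from the expected value by more than
    `tolerance` percentage points (hypothetical threshold)."""
    return abs(observed - expected) > tolerance


observed = gc_percent("GGCCAATT")          # half of the bases are G or C
off = gc_deviates(75.0, expected=40.0)     # the 40% vs 75% case from above
fine = gc_deviates(42.0, expected=40.0)    # small deviation, not flagged
```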

EDIT:

I just noticed that there is a big difference between the shortest and the longest read, so that plays a role.

written 6 months ago by gb • modified 6 months ago

Yes, they look like this after quality trimming, so the lengths vary a lot. So can I assume they could be poly-A tails?

Reply written 6 months ago by c.clarido

You could scan/trim for polyA and see if the length reduces even further.
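As a quick check before reaching for a dedicated trimming tool, something like this Python sketch could be used. The function name `trim_poly_a` and the minimum run length of 10 are assumptions made for the example; a real trimmer would normally also tolerate mismatches.

```python
import re


def trim_poly_a(read, min_run=10):
    """Remove a trailing run of at least `min_run` A's from a read.

    `min_run` is a hypothetical threshold. Reads without such a run
    are returned unchanged.
    """
    match = re.search(r"A{%d,}$" % min_run, read.upper())
    if match is None:
        return read
    return read[:match.start()]


reads = ["ACGT" + "A" * 12, "ACGTAAA", "GGCC"]
trimmed = [trim_poly_a(read) for read in reads]

# Compare the mean read length before and after trimming.
mean_before = sum(len(r) for r in reads) / len(reads)
mean_after = sum(len(r) for r in trimmed) / len(trimmed)
```

If the mean length drops clearly after trimming, poly-A tails are present. If it barely changes, the declining GC% is more likely the length effect discussed above.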

Reply written 6 months ago by genomax

No, you cannot assume they are poly-A tails. That is something you can check in the sequences themselves. But if they are not poly-A tails, it does not mean your data is wrong. You just get this result because of the differences in read length.

Reply written 6 months ago by gb • modified 6 months ago