Question: GC content in DNA sequence
2.2 years ago by
jaafari.omid50 wrote:

Hello all. I am working on GBS data. I trimmed my data, but for some samples FastQC then reported an error related to %GC content. For example, the GC content was 46% before trimming and 47% after trimming. Before trimming FastQC did not report an error, even though the %GC content was lower than after trimming. What should the GC content be for DNA sequence data such as GBS? Thanks in advance. Best regards, Omid

genome • 986 views
modified 2.2 years ago • written 2.2 years ago by jaafari.omid50

So the conclusion is that your adapter was rich in GC? :D The %GC depends on your target and your organism.

written 2.2 years ago by Titus910

Don't worry too much about FastQC errors; they often don't make sense. The difference you see in GC content before and after trimming is very small. You also shouldn't expect the percentage to be exactly 50%, as it depends on the genome.
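To see how small a 46% vs. 47% shift is, it can help to compute the GC content yourself. Below is a minimal Python sketch, assuming a plain uncompressed FASTQ with exactly four lines per record; the function name `per_read_gc` and the toy reads are made up for illustration. As far as I know, FastQC's per-sequence GC module looks at the shape of the per-read GC distribution rather than the mean value, which would explain why a slightly different mean can go together with a new warning.

```python
import io

def per_read_gc(fastq_handle):
    """Return a list with the %GC of each read in a 4-line-per-record FASTQ stream."""
    values = []
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # second line of each record holds the bases
            seq = line.strip().upper()
            called = sum(seq.count(b) for b in "ACGT")  # ignore N and other symbols
            if called:
                gc = seq.count("G") + seq.count("C")
                values.append(100.0 * gc / called)
    return values

# Toy data only (hypothetical reads, not from the poster's samples).
toy = io.StringIO(
    "@r1\nGATCGGAAGC\n+\nIIIIIIIIII\n"
    "@r2\nATATATGCGC\n+\nIIIIIIIIII\n"
)
gc_values = per_read_gc(toy)
mean_gc = sum(gc_values) / len(gc_values)
print(f"per-read %GC: {gc_values}, mean: {mean_gc:.1f}")
```

Running this on the files before and after trimming and comparing the per-read values (for example as a histogram) shows whether trimming changed the distribution or only nudged the mean.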

written 2.2 years ago by Martombo2.6k

Thanks, all, for your answers to my question.

written 2.2 years ago by jaafari.omid50

Could you also guide me regarding k-mer content? As far as I know, k-mer content is not important for RNA-seq data, but I don't know how important it is for DNA sequence data like GBS. I trimmed the raw data using the Stacks pipeline, but the k-mer content issue is still there afterwards.

written 2.2 years ago by jaafari.omid50

Please use ADD COMMENT or ADD REPLY to respond to previous posts, so that the thread remains logically structured and easy to follow. I have now moved your post, but as you can see it's not optimal. Adding an answer should only be used for providing a solution to the question asked.

written 2.2 years ago by WouterDeCoster43k

Actually, no one really answered the question; people just suggested not treating it as a big deal. I am having the same problem. The sequence "GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGC" appeared in the overrepresented sequences, with a suggestion that it could correspond to an adapter, but when I trimmed it the quality report just got worse, as also happened to jaafari.omid. Also, if it's an adapter, why does the "adapter content" plot show that no adapter is present? I also BLASTed the sequence and it has a 93.5% match with Staphylococcus phage Andhra (I leave this info here in case it helps), which might mean that the sample has been contaminated with that phage. On the other hand, this sequence does appear in adapter catalogs (TruSeq adapter). All of this is a bit confusing, so if someone has a well-founded explanation I would appreciate it. Thanks!
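One way to check how widespread the sequence really is, independently of the FastQC plots, is to count the reads that contain it and note where in the read it starts. The following is only a rough Python sketch, assuming a plain four-line-per-record FASTQ; `adapter_hits`, the 12-base seed length, and the toy reads are my own choices for illustration, not anything taken from FastQC or a trimming tool. Matching is exact, so reads with sequencing errors in the seed are missed.

```python
import io

# The overrepresented sequence quoted in the post above.
ADAPTER = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGC"

def adapter_hits(fastq_handle, adapter=ADAPTER, seed_len=12):
    """Count reads containing the first seed_len bases of adapter (exact match).

    Returns (hits, total_reads, start_positions).
    """
    seed = adapter[:seed_len]
    hits = total = 0
    positions = []
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # sequence line of each record
            total += 1
            pos = line.strip().upper().find(seed)
            if pos != -1:
                hits += 1
                positions.append(pos)
    return hits, total, positions

# Toy data only (hypothetical reads).
toy = io.StringIO(
    "@r1\nGATCGGAAGAGCACACGT\n+\nIIIIIIIIIIIIIIIIII\n"
    "@r2\nACGTACGTGATCGGAAGAGCAC\n+\nIIIIIIIIIIIIIIIIIIIIII\n"
    "@r3\nACGTACGTACGTACGT\n+\nIIIIIIIIIIIIIIII\n"
)
hits, total, positions = adapter_hits(toy)
print(f"{hits}/{total} reads contain the seed; start positions: {positions}")
```

If most hits start at position 0, the reads may be adapter dimers (no insert at all) rather than reads with an adapter tail, which is a different situation from the one the adapter content plot is usually read for; that interpretation is a guess and should be checked against the actual data.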

modified 13 months ago • written 13 months ago by msimmer92250

Hi, it should be an adapter if it appears in the adapter list :) I don't really understand what you mean by "the quality report just got worse". Best

written 13 months ago by Titus910