Question: GC content in DNA sequence
16 months ago, jaafari.omid (40) wrote:

Hello all. I am working on GBS data. I have trimmed my data, but for some samples FastQC showed an error related to %GC content. For example, the GC content was 46% before trimming and reached 47% after trimming. Before trimming, FastQC did not show the error, even though the %GC was lower than after trimming. What should the GC content be for DNA sequences such as GBS data? Thanks in advance. Best regards, Omid

genome • 730 views

So the conclusion is that your adapter was rich in GC? :D %GC depends on your target and your organism.

written 16 months ago by Titus (850)

Don't worry too much about FastQC errors; they often don't make sense. The difference you see in GC content before and after trimming is very small. You also shouldn't expect the percentage to be exactly 50%, as it depends on the genome.
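If you want to check the numbers yourself rather than rely on the FastQC flag, here is a minimal Python sketch for computing %GC from a FASTQ file. It is only an illustration, not how FastQC does it internally: the function names are made up for this example, it assumes plain four-line FASTQ records, and it ignores N bases. As far as I know, FastQC flags this module based on the shape of the per-read GC distribution rather than on the mean, which is why a 46% to 47% shift can coincide with a new warning.

```python
from io import StringIO


def gc_percent(seq):
    """Return %GC of one sequence, counting only A/C/G/T (N is ignored)."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / called


def read_fastq_sequences(handle):
    """Yield the sequence line of each record (assumes 4 lines per record)."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def overall_gc(handle):
    """Return %GC over all called bases in a FASTQ handle."""
    gc = total = 0
    for seq in read_fastq_sequences(handle):
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        total += sum(seq.count(base) for base in "ACGT")
    return 100.0 * gc / total if total else 0.0


if __name__ == "__main__":
    # Toy data standing in for a real file; replace with open("reads.fastq").
    toy = StringIO("@r1\nGGCC\n+\nIIII\n@r2\nATAT\n+\nIIII\n")
    print(overall_gc(toy))
```

Running this on the reads before and after trimming gives you the two percentages directly, and applying `gc_percent` per read lets you look at the distribution instead of just the mean.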

written 16 months ago by Martombo (2.4k)

Thank you all for your answers to my question.

written 16 months ago by jaafari.omid (40)

Could you also advise me about k-mers? As far as I know, k-mer content is not important for RNA-seq data, but I don't know how important it is for DNA sequences like GBS. I trimmed the raw data using the Stacks pipeline, but FastQC still reports k-mer content afterwards.

written 16 months ago by jaafari.omid (40)

Please use ADD COMMENT or ADD REPLY to respond to previous posts, so that the thread remains logically structured and easy to follow. I have now moved your post, but as you can see it's not optimal. Adding an answer should only be used for providing a solution to the question asked.

written 16 months ago by WouterDeCoster (39k)

Actually, no one really answered the question; people just suggested not to take it as a big deal. I am having the same problem. The sequence "GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGC" appeared in the overrepresented sequences, with a note that it could correspond to an adapter, but when I trimmed it the quality report just got worse, as also happened to jaafari.omid. Also, if it's an adapter, why does the "adapter content" plot show that no adapter is present? I also blasted the sequence and it has a 93.5% match with Staphylococcus phage Andhra (I leave this info here in case it helps), which might mean that the sample has been contaminated with that phage. On the other hand, this sequence does appear in adapter catalogs (TruSeq adapter). All of this is a bit confusing, so if someone has a well-founded explanation I would appreciate it. Thanks!
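For anyone who wants to compare an overrepresented sequence against an adapter themselves, here is a rough Python sketch. It is only an illustration: the helper name is made up, and the adapter stem `AGATCGGAAGAGC` is an assumption on my part (it is the sequence commonly given for trimming Illumina TruSeq reads, not something taken from the FastQC report above). The sketch just finds the longest stretch the two sequences share exactly.

```python
def longest_shared_substring(query, reference):
    """Return the longest substring of `query` that occurs exactly in `reference`."""
    query = query.upper()
    reference = reference.upper()
    # Try the longest candidate windows first and stop at the first hit.
    for length in range(len(query), 0, -1):
        for start in range(len(query) - length + 1):
            candidate = query[start:start + length]
            if candidate in reference:
                return candidate
    return ""


# Sequence reported in the overrepresented sequences (copied from the post).
overrepresented = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGC"

# Assumed adapter stem; swap in whatever adapter your kit documents.
adapter_stem = "AGATCGGAAGAGC"

shared = longest_shared_substring(adapter_stem, overrepresented)
print(shared, len(shared), "of", len(adapter_stem), "bases shared")
```

With these two sequences the shared stretch is the stem without its first base, so a search for the full stem as an exact match would not hit this read. Whether that is the reason the adapter content plot stays empty is something I can't confirm from here.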

modified 12 weeks ago • written 12 weeks ago by msimmer92 (180)

Hi, so it should be an adapter if it appears in an adapter list :) I don't really understand what you mean by "the quality report just got worse". Best

written 12 weeks ago by Titus (850)