GC content in DNA sequence
3.9 years ago
jaafari.omid ▴ 60

Hello all. I am working on GBS data. I have trimmed my data, but for some samples FastQC reported an error for %GC content. For example, the GC content was 46% before trimming and 47% after trimming. Before trimming FastQC did not report an error, even though the %GC content was lower than after trimming. What GC content should I expect for DNA sequences such as GBS? Thanks in advance. Best regards, Omid

genome • 1.5k views

Don't be too worried about FastQC errors; they often don't make sense. The difference you see in GC content before and after trimming is very small. You also shouldn't expect the percentage to be exactly 50%, as it depends on the genome.
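
To check the number yourself rather than relying on the FastQC flag, the overall GC percentage can be computed directly from the reads. The following is only a minimal sketch: the function names `gc_fraction` and `overall_gc` are made up for illustration, it assumes plain four-line FASTQ records, and it ignores N and other ambiguous bases.

```python
def gc_fraction(seq):
    """Fraction of G/C among the unambiguous bases (A, C, G, T) of one read."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0  # read made only of N or other ambiguous symbols
    return (seq.count("G") + seq.count("C")) / counted


def overall_gc(fastq_lines):
    """Pooled GC fraction over all reads.

    Assumes four-line FASTQ records, so the sequence is the
    second line of every record (hypothetical helper).
    """
    gc = 0
    total = 0
    for index, line in enumerate(fastq_lines):
        if index % 4 != 1:
            continue  # skip header, '+' and quality lines
        seq = line.strip().upper()
        gc += seq.count("G") + seq.count("C")
        total += sum(seq.count(base) for base in "ACGT")
    return gc / total if total else 0.0


if __name__ == "__main__":
    example = ["@read1", "GGCC", "+", "IIII", "@read2", "ATAT", "+", "IIII"]
    print(f"Overall GC: {overall_gc(example):.1%}")
```

Running this on the files before and after trimming gives two numbers to compare with the reference genome of the species, if one is available.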

Could you also advise me about Kmer content? As far as I know, Kmer content is not important for RNA-seq data, but I don't know how important it is for DNA sequences such as GBS. I trimmed the raw data using the Stacks pipeline, but after that I still see Kmer content in the report.

Please use ADD COMMENT or ADD REPLY to respond to earlier posts, so that the thread remains logically structured and easy to follow. I have now moved your post, but as you can see the result is not optimal. Adding an answer should only be used for providing a solution to the question asked.

Actually, no one really answered the question; the replies only suggested not treating it as a big deal. I am having the same problem. The sequence "GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGC" appeared in the overrepresented sequences, with a note that it could correspond to an adapter, but when I trimmed it the quality report just got worse, as also happened to jaafari.omid. Also, if it's an adapter, why does the "adapter content" plot show that no adapter is present? I also BLASTed the sequence and it has a 93.5% match with Staphylococcus phage Andhra (I leave this info here in case it helps), which might mean that the sample has been contaminated with that phage. On the other hand, this sequence does appear in adapter catalogs (TruSeq adapter). All of this is a bit confusing, so if someone has a well-founded explanation I would appreciate it. Thanks!
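
One quick sanity check is to see how much of the overrepresented sequence is shared with a known adapter prefix. The sketch below is illustrative only: `ADAPTER_PREFIX` is an assumption (the prefix commonly listed for Illumina TruSeq-style adapters, not something taken from this report), and `longest_shared_run` is a hypothetical helper that finds the longest exact stretch two sequences have in common.

```python
# Assumption: prefix commonly listed for Illumina TruSeq-style adapters.
ADAPTER_PREFIX = "AGATCGGAAGAGC"

# Overrepresented sequence quoted in the post above.
QUERY = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGC"


def longest_shared_run(a, b):
    """Return the longest exact substring of `a` that also occurs in `b`.

    Classic dynamic programming for the longest common substring,
    keeping only the previous row of the table.
    """
    best_len = 0
    best_end = 0  # end position (exclusive) of the best run in `a`
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


if __name__ == "__main__":
    shared = longest_shared_run(QUERY, ADAPTER_PREFIX)
    print(f"Shared run: {shared} ({len(shared)} of {len(ADAPTER_PREFIX)} bases)")
```

A long shared run at the very start of the sequence points towards an adapter origin rather than contamination, although a short match by itself does not prove either.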

Hi, it should be an adapter if it appears in the adapter list :) I don't really understand what you mean by "the quality report just got worse". Best