I am quality checking a gene sequence using FastQC.
It has given me warnings about sequence duplication (44%), per base sequence content and k-mer content.
Should I trim this sequence using Trim Galore or Cutadapt to rectify these problems?
I tried this using a quality cutoff of 30 (-q 30) and removing the adapter sequence AGATCGGAAGAGC. However, this returned my paired sequences with even more warnings, plus fails on per base sequence content and GC content.
So I am not sure how to proceed: in fixing the warnings on the original sequence, I have created new problems by trimming it.
Any help/advice would be appreciated :)
Does "a gene sequence" in your original post mean that you are looking at just one gene (amplicon sequencing)?
If that is the case then you would expect a lot of duplication. Have you scanned your data with a trimming program to ensure that there is no adapter contamination?
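A quick way to check for adapter contamination yourself is to count how many reads contain the adapter at all. Here is a minimal Python sketch, assuming a standard four-line (unwrapped) FASTQ and the adapter quoted in the original post. The function names are made up for illustration, and it only finds exact matches, so partial adapters at read ends will be missed:

```python
import gzip
import io

ADAPTER = "AGATCGGAAGAGC"  # adapter sequence quoted in the original post


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def read_sequences(handle):
    """Yield the sequence line (2nd of every 4 lines) from a FASTQ handle."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def adapter_fraction(sequences, adapter=ADAPTER):
    """Return the fraction of reads containing an exact copy of the adapter."""
    total = hits = 0
    for seq in sequences:
        total += 1
        if adapter in seq:
            hits += 1
    return hits / total if total else 0.0


# Tiny in-memory example: one read with the adapter, one without.
read1 = "ACGT" + ADAPTER + "TT"
read2 = "ACGTACGTACGT"
demo = io.StringIO(
    f"@r1\n{read1}\n+\n{'I' * len(read1)}\n"
    f"@r2\n{read2}\n+\n{'I' * len(read2)}\n"
)
print(adapter_fraction(read_sequences(demo)))  # 0.5
```

For a real file you would pass `read_sequences(open_fastq("reads_R1.fastq.gz"))` instead, where the file name is a placeholder. Running it on the reads before and after trimming shows whether the trimming step actually removed the adapter.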
Sorry, I meant to say a genome sequence.
TGCTG is a sequence identified in the 'Kmer Content' module, with a count of 752200.
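To put a count like that in context, it can help to tally the k-mer directly in the reads. The sketch below is only a plain overlapping count over a list of sequences, not a reproduction of how FastQC arrives at its own figure, and `count_kmer` is a hypothetical helper name:

```python
def count_kmer(sequences, kmer):
    """Count overlapping occurrences of kmer across all sequences."""
    total = 0
    for seq in sequences:
        start = seq.find(kmer)
        while start != -1:
            total += 1
            # Step forward by one base so overlapping hits are counted too.
            start = seq.find(kmer, start + 1)
    return total


# Toy reads: the first contains two overlapping copies of TGCTG.
reads = ["TGCTGCTG", "AAAAAAAA"]
print(count_kmer(reads, "TGCTG"))  # 2
```

Comparing the count against the total number of bases in the file gives a rough sense of whether the k-mer is genuinely over-represented or just common in this genome.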