Hi, I'm working on some WGBS data.
After quality and adapter trimming, the Sequence Duplication Levels and Per sequence GC content modules in FastQC still do not pass. In the Per sequence GC content plot, the peak of the read distribution is higher than the theoretical distribution. Is this OK?
Any help would be much appreciated!
Here are some pictures from FastQC after trimming.
There is a small fluctuation in the first few bases. Should I trim them? At the end of the reads, the sharp drop in A at the last position is a result of removing the adapter sequence very stringently, i.e. even a single trailing A at the end is removed.
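
To make the trailing-A effect concrete, here is a minimal Python sketch. It assumes the adapter is the standard Illumina sequence starting with AGATCGGAAGAGC and that the trimmer removes any 3' overlap of at least 1 bp. The function names and the toy reads are made up for illustration; a real trimmer also tolerates mismatches, which this sketch does not.

```python
from collections import Counter

# Assumption: standard Illumina adapter prefix; the actual adapter may differ.
ADAPTER = "AGATCGGAAGAGC"


def trim_3prime(read: str, adapter: str = ADAPTER, min_overlap: int = 1) -> str:
    """Remove an exact adapter match from the 3' end of a read.

    A full adapter occurrence is cut wherever it starts. Otherwise the
    longest read suffix that equals an adapter prefix is removed, provided
    it is at least `min_overlap` bases long.
    """
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    max_k = min(len(read), len(adapter) - 1)
    for k in range(max_k, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read


def last_base_counts(reads):
    """Count which base occupies the final position of each non-empty read."""
    return Counter(r[-1] for r in reads if r)


# Hypothetical toy reads: two end in A, one ends in AG.
reads = ["CCGTA", "TTGCA", "GGTTC", "ACGTT", "TTTAG"]
trimmed = [trim_3prime(r) for r in reads]

print(last_base_counts(reads))    # A is present at the last position
print(last_base_counts(trimmed))  # A is gone after 1 bp overlap trimming
```

With `min_overlap=1`, every read that happens to end in A loses that base because A is the first base of the adapter, so A is depleted at the last position. Raising `min_overlap` (for example to 3) leaves a lone trailing A in place.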
Should I deduplicate the reads during quality control (before mapping), or filter out duplicates after alignment using deduplicate_bismark?