I am new to bioinformatics, sequence alignment and variant calling, but I have recently been hired to analyze paired-end whole-genome sequencing data from grapevine accessions. To start, I ran FastQC on my FASTQ files and aggregated the results with MultiQC. The quality scores were very good for all files (not shown), but two other metrics worry me.
First, the GC content is expected to be around 35% and looks OK for most individuals, except for one individual whose GC content is around 50%, with a small peak around 67-70%. Interestingly, this second peak can also be observed for most other individuals, although it is much flatter. It is not clear to me whether this is the result of contamination or of something else.
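In case it helps to look at this outside of FastQC, here is a minimal Python sketch of how a per-read GC distribution could be recomputed from a FASTQ file. It is not FastQC's exact method, and the function names (`open_fastq`, `read_sequences`, `gc_percent`, `gc_histogram`) are my own, chosen for illustration only. It assumes standard 4-line FASTQ records.

```python
import gzip
from collections import Counter


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    path = str(path)
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def read_sequences(lines):
    """Yield the sequence line of each 4-line FASTQ record.

    Assumption: records are exactly 4 lines long (no wrapped sequences).
    """
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def gc_percent(seq):
    """Return the GC percentage of one read, ignoring N and other non-ACGT bases."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return None  # read with no called bases
    return 100.0 * (seq.count("G") + seq.count("C")) / called


def gc_histogram(seqs):
    """Count reads per GC percentage, rounded to the nearest integer."""
    hist = Counter()
    for seq in seqs:
        gc = gc_percent(seq)
        if gc is not None:
            hist[round(gc)] += 1
    return hist
```

With a hypothetical file name, this would be used as `with open_fastq("sample_R1.fastq.gz") as fh: hist = gc_histogram(read_sequences(fh))`. Reads falling in the 67-70% bin could then be pulled out and inspected separately.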
Second, the sequence duplication level looks OK for most individuals, except for three: the one identified based on GC content (in orange) and two others (in red). The pattern looks very odd for these last two individuals, as the duplication level is nonzero only for even numbers (2, 4, 6...). How is this possible?
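To double-check that the even-only pattern is really in the data and not an artifact of the plot, a rough Python sketch like the one below could count how many distinct sequences occur once, twice, three times, and so on. This only approximates what FastQC reports (it counts exact, full-length sequence matches), and the names `read_sequences`, `duplication_levels` and `odd_level_fraction` are hypothetical helpers of mine. The `limit` parameter is there because holding every read of a whole-genome file in memory may not be practical.

```python
from collections import Counter
from itertools import islice


def read_sequences(lines, limit=None):
    """Yield sequence lines from 4-line FASTQ records, optionally only the first `limit` reads."""
    seqs = (line.strip() for i, line in enumerate(lines) if i % 4 == 1)
    return islice(seqs, limit)


def duplication_levels(seqs):
    """Map duplication level -> number of distinct sequences seen that many times."""
    copies = Counter(seqs)          # sequence -> number of occurrences
    return Counter(copies.values())  # occurrences -> number of distinct sequences


def odd_level_fraction(levels):
    """Fraction of distinct sequences whose duplication level is odd (1, 3, 5...)."""
    total = sum(levels.values())
    odd = sum(n for level, n in levels.items() if level % 2 == 1)
    return odd / total if total else 0.0
```

For an open file handle `fh`, `levels = duplication_levels(read_sequences(fh, limit=1_000_000))` gives the histogram, and an `odd_level_fraction(levels)` close to 0 would confirm that almost every sequence occurs an even number of times.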
If anyone could help me interpret these results, I would be very grateful. I could also use some guidance on how to deal with those files (discard them, contact the sequencing company, run specific software...). Thanks for your help!