I've seen many posts and questions about the "Per base sequence content" error flagged by FastQC. My understanding is that variation in sequence content at the start of reads is normal for RNA-seq on Illumina because of random priming. However, I couldn't find a straight answer on whether this is also normal for DNA-seq.
I did metagenomic (DNA) sequencing with a library made using the NEBNext Ultra II FS Kit, and FastQC flagged a warning for "Per base sequence content".
I have attached the Per base sequence content plot; it should be accessible from the link below.
Is this expected for DNA sequencing?
If the library was made using tagmentation then you will see that pattern.
Warnings flagged by FastQC do not automatically mean that the data is bad or that you can't move forward with your analysis.
Note: The image you were trying to include does not appear to have attached properly.
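If you want to look at the numbers behind that plot yourself, here is a minimal sketch in plain Python that tallies base fractions at the first few read positions. The function names and the 15-position default are my own choices for illustration, not anything from FastQC, and it assumes an uncompressed FASTQ with exactly four lines per record.

```python
from collections import Counter


def read_fastq_sequences(handle):
    """Yield the sequence line of each 4-line FASTQ record.

    Assumes strict 4-line records (no wrapped sequences).
    """
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def per_base_fractions(sequences, n_positions=15):
    """Return one dict per position with the fraction of A, C, G, T and N."""
    counts = [Counter() for _ in range(n_positions)]
    for seq in sequences:
        for i, base in enumerate(seq[:n_positions].upper()):
            counts[i][base] += 1

    fractions = []
    for c in counts:
        total = sum(c.values())
        fractions.append(
            {b: (c[b] / total if total else 0.0) for b in "ACGTN"}
        )
    return fractions


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python base_content.py reads.fastq
    with open(sys.argv[1]) as fh:
        for pos, frac in enumerate(per_base_fractions(read_fastq_sequences(fh)), 1):
            print(pos, " ".join(f"{b}={frac[b]:.3f}" for b in "ACGTN"))
```

Comparing the first positions against positions further into the read should show whether the skew is confined to the start, which is what the FastQC plot is summarizing.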
Yes, I have proceeded with the analysis, but I wonder whether it is necessary to remove the first 10 or so bp from the reads.
I have checked, and the NEBNext Ultra II FS Kit is not tagmentation-based (learned something new today!), so I am now curious what causes the abnormal base composition at the start of the reads.
Tried to fix the image, hopefully it shows up now.
Unless the kit directions tell you to remove the initial 10-15 bp, you may as well leave them alone. My intuition is that the data should still align.
Thanks for fixing the image.
I think this is to be expected, based on a report I was able to find online.
I also found this study comparing NEBNext Ultra II FS results to those of other kits: https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-022-08316-y
Thanks, I will have a look at them!