Question: Strange sequence content pattern appearing at the head of sequenced pieces in a fastq file.
3.6 years ago
United States


Below is part of the FastQC result for a fastq file from a single-cell RNA-seq dataset. I noticed that the sequence content in the first 15 nt is much more uneven than in the rest of the read. I cannot find a reasonable explanation.

I apologize for my ignorance, but I'm still new to some of the techniques used in next-gen sequencing. As I understand it, shotgun sequencing can cut the sequence anywhere, so there shouldn't be a reason that, for example, the first nt has a significantly higher frequency of C than of any other base, especially considering that C is on average less frequent than T and A. It's almost as if some specific sequence appears frequently at the head of all the sequenced pieces, but I don't know why that would be. I'm sure a lot of people have seen a similar pattern before.

Could this be due to unremoved barcodes or specific primers used in sample amplification?

3.6 years ago
Devon Ryan (82k)
Freiburg, Germany


What you're showing is expected. It is thought to be caused by the "random hexamer priming" not actually being so random, which skews the base composition at the start of each read. Some kits produce this more than others. It's nothing to worry about.
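If you want to look at the numbers behind the plot yourself, here is a minimal sketch that tabulates the base fractions at the first few read positions. It is not how FastQC itself is implemented; the function name `per_base_content` and the toy reads are made up for illustration, and it assumes an uncompressed fastq with strict 4-line records (no wrapped sequence lines).

```python
from collections import Counter
import io


def per_base_content(fastq_handle, max_pos=15):
    """Return one dict per read position (first max_pos positions),
    mapping each of A/C/G/T/N to its fraction among the reads.

    Assumption: every record is exactly 4 lines, so the sequence is
    always the 2nd line of a record.
    """
    counts = [Counter() for _ in range(max_pos)]
    for i, line in enumerate(fastq_handle):
        if i % 4 != 1:  # skip header, '+' separator, and quality lines
            continue
        seq = line.strip().upper()
        for pos, base in enumerate(seq[:max_pos]):
            counts[pos][base] += 1

    fractions = []
    for c in counts:
        total = sum(c.values())
        fractions.append(
            {b: (c[b] / total if total else 0.0) for b in "ACGTN"}
        )
    return fractions


# Toy data (hypothetical reads, only to show the call pattern).
toy = (
    "@r1\nCATG\n+\nIIII\n"
    "@r2\nCTTG\n+\nIIII\n"
    "@r3\nGATA\n+\nIIII\n"
    "@r4\nCAGG\n+\nIIII\n"
)
frac = per_base_content(io.StringIO(toy), max_pos=4)
for pos, f in enumerate(frac, start=1):
    print(pos, f)
```

For a real file you would pass an open file handle instead of the `io.StringIO` object, e.g. `open("reads.fastq")`, where the file name is a placeholder. With hexamer-priming bias you would expect the fractions at the first positions to differ from those further into the read.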

