Question

Strange sequence content pattern appearing at the head of sequenced pieces in a fastq file.

1

9.3 years ago

zhuxun2 ▴ 20

Below is part of the FastQC output for a FASTQ file from a single-cell RNA-seq dataset. I noticed that the sequence content over the first 15 nt is much more uneven than in the rest of the read. I cannot find a reasonable explanation.

I apologize for my ignorance, but I'm still new to some of the techniques used in next-generation sequencing. As I understand it, shotgun sequencing can cut the sequence anywhere, so there shouldn't be a reason why, for example, the first nt has a significantly higher frequency of C than of any other base, especially considering that C is on average less frequent than T and A. It's almost as if some specific sequences appear frequently at the head of all the sequenced pieces, but I don't know why that would be. I'm sure a lot of people have seen a similar pattern before.

< image not found >

fastqc fastq RNA-Seq single-cell • 2.2k views

updated 2.1 years ago by Ram 43k • written 9.3 years ago by zhuxun2 ▴ 20

0

Could this be due to barcodes that were not removed, or to specific primers used in sample amplification?
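One rough way to check this is to count the most common read prefixes. The sketch below is only an illustration: the function name `top_read_prefixes` and its parameters are made up for this example, and it assumes plain 4-line FASTQ records (no wrapped sequence lines). If a fixed barcode or primer were still attached, a few prefixes would typically account for a large share of the reads.

```python
from collections import Counter
import io


def top_read_prefixes(fastq_handle, prefix_len=8, top_n=5):
    """Return the top_n most common read prefixes as (prefix, fraction) pairs.

    Assumes each record is exactly 4 lines, with the sequence on line 2.
    Reads shorter than prefix_len are skipped.
    """
    prefixes = Counter()
    n_reads = 0
    for line_no, line in enumerate(fastq_handle):
        if line_no % 4 != 1:  # keep only the sequence line of each record
            continue
        seq = line.strip().upper()
        if len(seq) >= prefix_len:
            prefixes[seq[:prefix_len]] += 1
            n_reads += 1
    return [(p, c / n_reads) for p, c in prefixes.most_common(top_n)]


if __name__ == "__main__":
    # Toy data for illustration only, not taken from the original post.
    toy = "@r1\nACGTAAAA\n+\nIIIIIIII\n@r2\nACGTCCCC\n+\nIIIIIIII\n@r3\nTTTTGGGG\n+\nIIIIIIII\n"
    for prefix, frac in top_read_prefixes(io.StringIO(toy), prefix_len=4):
        print(prefix, round(frac, 3))
```

For a real file, one could pass `open("reads.fastq")` (a placeholder path) instead of the toy `StringIO` handle.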

updated 2.1 years ago by Ram 43k • written 9.3 years ago by Peter 6.0k

Ram · Answer 1 · 2015-01-14

2

9.3 years ago

Devon Ryan 104k

What you're showing is expected and thought to be due to the "random hexamer priming" not actually being so random. Some kits produce this more than others. It's nothing to worry about.
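To look at the bias directly, one could tabulate base frequencies at each of the first few read positions, which is roughly what a per-base sequence content plot summarizes. This is a minimal sketch, not FastQC's own implementation: the function name `per_position_composition` and the default of 15 positions are assumptions for illustration, and it assumes plain 4-line FASTQ records.

```python
from collections import Counter
import io


def per_position_composition(fastq_handle, n_positions=15):
    """Return one dict per read position mapping base -> fraction of reads.

    Assumes each record is exactly 4 lines, with the sequence on line 2.
    Only the first n_positions bases of each read are counted.
    """
    counts = [Counter() for _ in range(n_positions)]
    for line_no, line in enumerate(fastq_handle):
        if line_no % 4 != 1:  # keep only the sequence line of each record
            continue
        seq = line.strip().upper()
        for pos, base in enumerate(seq[:n_positions]):
            counts[pos][base] += 1
    fractions = []
    for c in counts:
        total = sum(c.values())
        # Positions that no read covers get an empty dict.
        fractions.append({b: c[b] / total for b in "ACGTN"} if total else {})
    return fractions


if __name__ == "__main__":
    # Toy data for illustration only, not taken from the original post.
    toy = "@r1\nCATG\n+\nIIII\n@r2\nCTTG\n+\nIIII\n"
    for pos, freqs in enumerate(per_position_composition(io.StringIO(toy), 4), 1):
        print(pos, freqs)
```

With hexamer-priming bias, the fractions at the first dozen or so positions would be expected to differ from those further into the read, where they level off.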

updated 2.1 years ago by Ram 43k • written 9.3 years ago by Devon Ryan 104k