Question: RNAseq sample never has A in last base
i.sudbery (9.3k), Sheffield, UK, wrote 18 months ago:

When quality controlling an RNAseq sample, I came across this:

The last base of the reads can be anything other than A. Has anyone ever come across this before?

QC shows nothing else of concern. No adaptor contamination, no over-represented sequences, normal looking per read GC content.
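For anyone who wants to check the same thing outside of FastQC, here is a minimal sketch that tallies the final base of every read. It assumes a plain four-line-per-record FASTQ; the function name `last_base_counts` and the toy reads are made up for illustration and are not from the sample in question.

```python
from collections import Counter
import io

def last_base_counts(fastq_handle):
    """Count the final base of every read in a FASTQ stream (4 lines per record)."""
    counts = Counter()
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # the sequence line of each record
            seq = line.strip()
            if seq:
                counts[seq[-1].upper()] += 1
    return counts

# Toy data, invented for illustration only
toy_fastq = io.StringIO(
    "@r1\nACGTC\n+\nIIIII\n"
    "@r2\nTTGAG\n+\nIIIII\n"
    "@r3\nGGCAT\n+\nIIIII\n"
)
counts = last_base_counts(toy_fastq)
print(counts)
```

For a real file, pass an open file handle (or `gzip.open(path, "rt")` for compressed data) instead of the `StringIO` object.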

fastqc rna-seq • 651 views
modified 7 weeks ago by adi.rotem (0) • written 18 months ago by i.sudbery (9.3k)

If you do a Google image search for "fastqc per base sequence content", you will see examples of similar profiles.

Reply written 18 months ago by genomax (90k)
swbarnes2 (8.7k), United States, wrote 18 months ago:

I have noticed that before, but not with RNAseq. It was a rather artificial setup, where the sequence of interest was exactly 42 bases long, so we sequenced for just 43 bases (so that there would be 42 bases with post-phasing data). I noticed that a lot of reads had the last letter trimmed, but only if it was expected to be an A.

I remade the fastqs by turning off adapter trimming in the sample sheet, and that made all the reads 42 bases as they should be.

We were using a pretty old version of bcl2fastq for a while, so I can't say whether newer versions show the same behavior.

written 18 months ago by swbarnes2 (8.7k)
lieven.sterck (8.6k), VIB, Ghent, Belgium, wrote 18 months ago:

Yes, I have seen that before as well (at least for DNA reads). It's a typical pattern for the (read length + 1)th base. That base is close to garbage, so it should be removed anyway. Whether the pattern is specifically "anything but A" I'm not sure (I will need to check).

For more (related) info on this topic:

Why 50bp Illumina run produces 51bp long sequences?

EDIT: I didn't even have to look for an example: here is the first random one I took from a dataset generated in 2018

[image: example from a dataset generated in 2018]

modified 18 months ago • written 18 months ago by lieven.sterck (8.6k)

Thanks. Quality trimming didn't remove this, but I'll have a go at hard trimming the last base. Normally I'd think that hard trimming the reads was just masking a problem.
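For reference, hard trimming the last base could look roughly like the sketch below. It assumes a plain four-line-per-record FASTQ, and the helper name `hard_trim_last_base` is hypothetical. Dedicated trimming tools do the same job and handle compressed or paired files; this only shows what the operation does to each record.

```python
import io

def hard_trim_last_base(fastq_in, fastq_out, n=1):
    """Copy FASTQ records, dropping the final n bases and quality values."""
    for i, line in enumerate(fastq_in):
        line = line.rstrip("\n")
        if i % 4 in (1, 3):  # sequence and quality lines only
            line = line[:max(len(line) - n, 0)]
        fastq_out.write(line + "\n")

# Toy record, invented for illustration only
src = io.StringIO("@r1\nACGTA\n+\nIIII#\n")
dst = io.StringIO()
hard_trim_last_base(src, dst)
print(dst.getvalue())
```

The sequence and quality lines are shortened together so that they stay the same length, which downstream tools require.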

Reply written 18 months ago by i.sudbery (9.3k)

If I understood correctly, that last (extra) base does not get trimmed because it is 'rescued' by the preceding bases. I'm guessing that if you set the window size to 1 with a low Q-value threshold, it will get trimmed.
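To illustrate the 'rescue' effect, here is a rough sketch of a simplified sliding-window trimmer. It is not the algorithm of any particular tool: the function `sliding_window_trim`, the quality values and the threshold of 20 are all assumptions chosen for the example.

```python
def sliding_window_trim(quals, window, min_q):
    """Return how many bases are kept.

    Simplified scheme: scan from the 5' end and cut at the start of the
    first window whose mean quality is below min_q.
    """
    n = len(quals)
    for start in range(0, n - window + 1):
        w = quals[start:start + window]
        if sum(w) / window < min_q:
            return start
    return n

# Ten good bases followed by one poor extra base (invented values)
quals = [36] * 10 + [8]

kept_w4 = sliding_window_trim(quals, window=4, min_q=20)
kept_w1 = sliding_window_trim(quals, window=1, min_q=20)
print(kept_w4, kept_w1)
```

With a window of 4, the final window averages (36 + 36 + 36 + 8) / 4 = 29, which is above the threshold, so the poor base survives. With a window of 1, that base is judged on its own and is cut.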

Reply written 18 months ago by lieven.sterck (8.6k)
adi.rotem (0) wrote 7 weeks ago:

Did you trim the reads? If so, maybe you used a trimming option where adaptors of length 1 are removed (like the -O 1 option in cutadapt). If your adaptor starts with an A, this will remove all A's in the last base of the read.
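A toy model of why a minimum overlap of 1 eats terminal A's is sketched below. It is not cutadapt's actual alignment code: it only does exact matching of a 3' adapter, the function `trim_3prime_adapter` is hypothetical, and the adapter string and reads are assumptions picked for illustration.

```python
def trim_3prime_adapter(read, adapter, min_overlap):
    """Exact-match 3' adapter removal (no mismatches or indels).

    Cuts at the leftmost position where the rest of the read matches the
    start of the adapter over at least min_overlap bases.
    """
    for pos in range(len(read)):
        tail = read[pos:pos + len(adapter)]
        if len(tail) >= min_overlap and adapter.startswith(tail):
            return read[:pos]
    return read

adapter = "AGATCGGAAGAGC"  # assumed adapter; note that it starts with A
reads = ["CCGTTGCA", "CCGTTGCT", "CCGTTGAG"]  # invented reads

trimmed_o1 = [trim_3prime_adapter(r, adapter, min_overlap=1) for r in reads]
trimmed_o3 = [trim_3prime_adapter(r, adapter, min_overlap=3) for r in reads]
print(trimmed_o1)
print(trimmed_o3)
```

With a minimum overlap of 1, a single trailing A counts as the start of the adapter and is removed, while a minimum overlap of 3 leaves these reads alone. In this sketch a read ending in AA would still end in A after one pass, since only one match is removed.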

written 7 weeks ago by adi.rotem (0)