Question

RNAseq sample never has A in last base

1

5.1 years ago

i.sudbery 19k

When quality controlling an RNAseq sample, I came across this:

The last base of the reads can be anything other than A. Has anyone ever come across this before?

QC shows nothing else of concern: no adaptor contamination, no over-represented sequences, and normal-looking per-read GC content.
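To check the observation directly rather than reading it off the FastQC plot, one could tally which base sits at the final position of each read. The following is a minimal sketch, assuming an uncompressed four-line-per-record FASTQ; the function name `last_base_counts` and the toy records are made up for illustration.

```python
from collections import Counter
from io import StringIO


def last_base_counts(fastq_handle):
    """Count which base occupies the final position of each read."""
    counts = Counter()
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # second line of each 4-line record is the sequence
            seq = line.strip()
            if seq:
                counts[seq[-1]] += 1
    return counts


# Toy records (invented data, for illustration only)
example = StringIO(
    "@r1\nACGTC\n+\nIIIII\n"
    "@r2\nTTGAG\n+\nIIIII\n"
    "@r3\nGGCAT\n+\nIIIII\n"
)
counts = last_base_counts(example)
print(counts)
```

For a real file, the same function could be called on `open("reads.fastq")` (a hypothetical path); a count of zero for `A` would confirm what the plot suggests.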

RNA-Seq fastqc • 1.5k views

ADD COMMENT • link updated 3.7 years ago by adi.rotem ▴ 20 • written 5.1 years ago by i.sudbery 19k

0

If you do a Google image search for "fastqc per base sequence content", you will see examples of similar profiles.

ADD REPLY • link 5.1 years ago by GenoMax 141k

score 2 · Answer 1 · 2019-03-18

I noticed that before, but not with RNAseq. It was a kind of artificial setup, where the sequence of interest was exactly 42 bases long, so we sequenced for just 43 bases (so that there would be 42 bases with post-phasing data). I noticed that a lot of reads had the last letter trimmed, but only if it was expected to be an A.

I remade the fastqs by turning off adapter trimming in the sample sheet, and that made all the reads 42 bases as they should be.

We were using a pretty old version of bcl2fastq for a while, so I can't say whether newer versions show the same behavior.

score 2 · Answer 2 · 2019-03-18

2

5.1 years ago

lieven.sterck 15k

Yes, I've seen that before as well (at least for DNA reads). It's a typical pattern for the (read length + 1)th base; that base is near crap, so it should be removed anyway. Whether it really indicates that the base can be anything but A, I'm not sure (I'll need to check).

For more (related) info on this topic:

Why 50bp Illumina run produces 51bp long sequencs?

EDIT: I didn't even have to look for an example: here is the first random one I took from a dataset generated in 2018

enter image description here

ADD COMMENT • link 5.1 years ago by lieven.sterck 15k

0

Thanks. Quality trimming didn't remove this, but I'll have a go at hard trimming the last base. Normally I'd think that hard trimming the reads was just masking a problem.
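For reference, hard trimming here just means dropping the final base and its quality score from every read, regardless of quality. A minimal sketch, assuming each record is held as a `(header, sequence, separator, qualities)` tuple; the function name `hard_trim_last_base` and the example record are hypothetical.

```python
def hard_trim_last_base(record):
    """Drop the final base and its quality character from one FASTQ record."""
    header, seq, plus, qual = record
    return (header, seq[:-1], plus, qual[:-1])


# Invented example record: the last base carries a low quality ('#')
record = ("@r1", "ACGTC", "+", "IIII#")
trimmed = hard_trim_last_base(record)
print(trimmed)
```

Sequence and quality strings must be shortened together, otherwise downstream tools will reject the record for having mismatched lengths.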

ADD REPLY • link 5.1 years ago by i.sudbery 19k

0

If I understood correctly, that last (extra) base does not get trimmed because it's 'rescued' by the previous bases. I'm guessing that if you set the window size to 1 with a low Q-value threshold, it will get trimmed.
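The 'rescue' effect can be illustrated with a generic sliding-window trimmer. This is a sketch of the general idea only, not the implementation of any particular trimming tool; the function name, the threshold of Q20, and the quality values are assumptions chosen for illustration.

```python
def sliding_window_trim(quals, window, min_mean_q):
    """Return how many bases to keep.

    Scans from the 5' end and cuts at the start of the first window
    whose mean quality falls below min_mean_q.
    """
    n = len(quals)
    if n < window:
        return n
    for start in range(0, n - window + 1):
        win = quals[start:start + window]
        if sum(win) / window < min_mean_q:
            return start
    return n


# Invented qualities: seven good bases followed by one poor final base
quals = [38, 38, 38, 38, 38, 38, 38, 2]
kept_w4 = sliding_window_trim(quals, window=4, min_mean_q=20)
kept_w1 = sliding_window_trim(quals, window=1, min_mean_q=20)
print(kept_w4, kept_w1)
```

With a window of 4, the final window averages (38 + 38 + 38 + 2) / 4 = 29, which passes the threshold, so the poor last base survives. With a window of 1, that base is judged on its own and is cut.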

ADD REPLY • link 5.1 years ago by lieven.sterck 15k

score 0 · Answer 3 · 2020-08-13

0

3.7 years ago

adi.rotem ▴ 20

Did you trim the reads? If so, maybe you used a trimming option where adaptor matches as short as 1 base are removed (like the -O 1 option in cutadapt). If your adaptor starts with an A, this will remove every A that sits at the last base of a read.
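To see why a minimum overlap of 1 would strip every trailing A, here is a simplified sketch of 3' adaptor trimming. It uses exact matching only and is not cutadapt's actual algorithm (which tolerates mismatches); the function name is hypothetical, and the adaptor shown is just an example of one that starts with A.

```python
def trim_3prime_adapter(read, adapter, min_overlap):
    """Remove a 3' adaptor, allowing partial matches at the read end.

    A partial match is accepted only if at least min_overlap bases of the
    read's suffix equal the adaptor's prefix (exact matching only).
    """
    # Case 1: the full adaptor occurs somewhere in the read
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends with a prefix of the adaptor, longest first
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read


adapter = "AGATCGGAAGAGC"  # example adaptor beginning with A
ends_in_a = "CCGTTA"       # invented read whose last base is a genuine A
ends_in_g = "CCGTTG"       # invented read ending in G

print(trim_3prime_adapter(ends_in_a, adapter, min_overlap=1))
print(trim_3prime_adapter(ends_in_a, adapter, min_overlap=3))
print(trim_3prime_adapter(ends_in_g, adapter, min_overlap=1))
```

With `min_overlap=1`, a lone trailing A counts as the start of the adaptor and is removed, so no trimmed read can end in A at full length. With a larger minimum overlap, the same read is left untouched.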

ADD COMMENT • link 3.7 years ago by adi.rotem ▴ 20