RNAseq sample never has A in last base
3
1
5.1 years ago

When quality controlling an RNAseq sample, I came across this:

The last base of the reads can be anything other than A. Has anyone ever come across this before?
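For anyone wanting to confirm the pattern outside of FastQC, here is a minimal sketch that tallies the final base of every read. It assumes plain 4-line FASTQ records (no wrapped sequence lines); the function name `last_base_counts` and the toy records are made up for illustration.

```python
from collections import Counter


def last_base_counts(fastq_lines):
    """Count the final base of every read in an iterable of FASTQ lines.

    Assumes strict 4-line records: header, sequence, '+', qualities.
    """
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # second line of each record is the sequence
            seq = line.strip()
            if seq:
                counts[seq[-1]] += 1
    return counts


# Toy records, invented for illustration only
example = [
    "@r1", "ACGTC", "+", "IIIII",
    "@r2", "TTGAG", "+", "IIIII",
    "@r3", "GGCAT", "+", "IIIII",
]
print(last_base_counts(example))
```

On a real file you could pass an open file handle (or `gzip.open(path, "rt")` for compressed data) instead of the list.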

QC shows nothing else of concern: no adaptor contamination, no over-represented sequences, and normal-looking per-read GC content.

RNA-Seq fastqc • 1.5k views
ADD COMMENT
0

If you do a Google image search for "fastqc per base sequence content", you will see examples of similar profiles.

ADD REPLY
2
5.1 years ago

I have noticed that before, but not with RNAseq. It was a somewhat artificial setup, where the sequence of interest was exactly 42 bases long, so we sequenced for just 43 bases (so that there would be 42 bases with post-phasing data). I noticed that a lot of reads had the last letter trimmed, but only if it was expected to be an A.

I remade the fastqs by turning off adapter trimming in the sample sheet, and that made all the reads 42 bases as they should be.

We were using a pretty old version of bcl2fastq for a while, so I can't say whether newer versions show the same behavior.

ADD COMMENT
2
5.1 years ago

Yes, I have seen that before as well (at least for DNA reads). It's a typical pattern for the (read length + 1)th base. That base is close to garbage, so it should be removed anyway. Whether it can be anything but A, I'm not sure (I will need to check).

For more (related) info on this topic :

Why 50bp Illumina run produces 51bp long sequences?


EDIT: I didn't even have to look for an example: here is the first random one I took from a dataset generated in 2018

[image]

ADD COMMENT
0

Thanks. Quality trimming didn't remove this, but I'll have a go at hard trimming the last base. Normally I'd think that hard trimming the reads was just masking a problem.
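Hard trimming the last base is simple enough to sketch directly. This is only an illustration, assuming 4-line FASTQ records; `hard_trim_last_base` is a hypothetical helper name, and established trimming tools offer the same operation.

```python
def hard_trim_last_base(fastq_lines, n=1):
    """Yield FASTQ lines with the last n bases and qualities removed.

    Assumes strict 4-line records: header, sequence, '+', qualities.
    Reads of length <= n become empty and may need filtering afterwards.
    """
    for i, line in enumerate(fastq_lines):
        line = line.rstrip("\n")
        if n > 0 and i % 4 in (1, 3):  # sequence and quality lines
            line = line[:-n]
        yield line


# Toy record, invented for illustration only
record = ["@r1", "ACGTA", "+", "IIII#"]
print(list(hard_trim_last_base(record)))
```

The sequence and quality strings are cut by the same amount, so they stay the same length.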

ADD REPLY
0

If I understood correctly, that last (extra) base does not get trimmed because it's 'rescued' by the previous bases. I'm guessing that if you set the window size to 1 with a low Q-value threshold, it will get trimmed.
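To illustrate the 'rescue' effect, here is a rough sketch rather than any particular tool's algorithm. It assumes Phred+33 quality encoding, and the function names and the threshold of 20 are chosen only for the example. A window of 4 averages the bad last base together with its good neighbours, while a window of 1 judges it on its own.

```python
def window_mean_quality(qual, size, offset=33):
    """Mean Phred score of the last `size` quality characters."""
    tail = qual[-size:]
    return sum(ord(c) - offset for c in tail) / len(tail)


def trim_low_quality_tail(seq, qual, min_q=20, offset=33):
    """Window-size-1 trimming: drop 3' bases while their own quality < min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# Toy read, invented for illustration: 'I' is Q40, '#' is Q2
seq, qual = "ACGTA", "IIII#"
print(window_mean_quality(qual, size=4))  # window of 4 still averages above 20
print(window_mean_quality(qual, size=1))  # window of 1 sees only the bad base
print(trim_low_quality_tail(seq, qual))
```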

ADD REPLY
0
3.7 years ago
adi.rotem ▴ 20

Did you trim the reads? If so, maybe you used a trimming option where adaptor matches as short as 1 base are removed (like the -O 1 option in cutadapt). If your adaptor starts with an A, this will remove all A's in the last base of the read.
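The effect of a minimum overlap of 1 can be shown with a simplified simulation. This is not cutadapt's actual algorithm (it does exact matching only, with no error tolerance); `trim_3prime_adapter` is a hypothetical function, and the adapter string is used only as an example of an adaptor that starts with A.

```python
def trim_3prime_adapter(read, adapter, min_overlap=3):
    """Remove a 3' adapter, allowing partial matches at the read end.

    Simplified: exact matches only. A partial match is an adapter prefix
    of at least `min_overlap` bases that forms the end of the read.
    """
    pos = read.find(adapter)
    if pos != -1:  # full adapter present inside the read
        return read[:pos]
    shortest = max(min_overlap, 1)
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, shortest - 1, -1):  # prefer the longest match
        if read.endswith(adapter[:k]):
            return read[:-k]
    return read


adapter = "AGATCGGAAGAGC"  # example adapter beginning with A
reads = ["ACGTTGCA", "ACGTTGCC", "ACGTTGAG"]  # toy reads, invented

for overlap in (3, 1):
    trimmed = [trim_3prime_adapter(r, adapter, overlap) for r in reads]
    print(overlap, trimmed)
```

With a minimum overlap of 3 the toy reads are left alone, but with 1 the read ending in A loses that base because a lone A counts as the start of the adaptor.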

ADD COMMENT
