Per base sequence content: "A" underrepresented
4.1 years ago
Wöps ▴ 10

Dear all,

My RNA-seq reads seem to be slightly biased towards 'T' over 'A' (see image). This is unexpected to me. Does anyone have a suggestion as to why that might be (and how to proceed)?

[Image: FastQC "Per base sequence content" plot]

The data are paired-end, sequenced with Illumina HiSeq 2500. (The read direction is reverse.)

Thanks for your help!

RNA-Seq fastqc • 1.6k views

Your image doesn't work, but something to think about is that most RNA-seq technologies sequence the cDNA (the reverse-transcribed copy) of your mRNA molecules. Thus, assuming the method is strand-specific, the poly-A tails will often look like stretches of T's, potentially explaining your results.
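To illustrate the point about poly-A tails showing up as T's, here is a minimal Python sketch. The sequence `transcript_end` is made up for illustration, and the helper `reverse_complement` is my own, not part of any particular library:

```python
# Complement table for DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Hypothetical 3' end of a transcript (sense strand) with a short poly-A tail.
transcript_end = "GCTTACGGAAAAAAAAAA"

# A read taken from the opposite (antisense) strand sees the tail as a run of T's.
antisense_read = reverse_complement(transcript_end)
print(antisense_read)
```

If many reads in a library come from the antisense strand and overlap a tail, that alone could push the T fraction above the A fraction.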


Fixed the link to the image. Please use the image button and paste the full link, including the suffix (e.g. .png), into the field that pops up:

enter image description here


Thank you! Will do next time.


Have you trimmed off poly-A sections?


I hadn't trimmed poly-A sections so far. I have now quickly tried running AfterQC (https://github.com/OpenGene/AfterQC) to remove any reads with a polyX run longer than 20. It does not seem to change the situation, though:

FastQC after removing polyX
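For anyone wanting to check a polyX filter by hand, the idea can be sketched in plain Python as below. This is not AfterQC's implementation, only an approximation of "drop reads with a single-base run longer than 20"; the function names, the default threshold and the example reads are assumptions for illustration:

```python
import re
from collections import Counter


def has_long_homopolymer(seq: str, max_run: int = 20) -> bool:
    """True if seq contains a run of a single base longer than max_run."""
    # One base followed by at least max_run repeats of itself: max_run + 1 or more.
    pattern = r"([ACGTN])\1{%d,}" % max_run
    return re.search(pattern, seq.upper()) is not None


def base_fractions(reads):
    """Overall fraction of each base across all reads."""
    counts = Counter()
    for read in reads:
        counts.update(read.upper())
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}


# Made-up reads: one is mostly poly-T, the other has no long run.
reads = ["T" * 25 + "ACGTA", "ACGTTGCATTACGGATTCAT"]
kept = [r for r in reads if not has_long_homopolymer(r)]

print(base_fractions(reads))
print(base_fractions(kept))
```

Comparing the fractions before and after filtering shows how much of the T excess actually comes from long runs.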


As GenoMax says, what you see is normal. Just proceed with your analysis.

4.1 years ago
GenoMax 141k

This is a fairly classic profile for RNA-seq libraries. See this blog post from the authors of FastQC for more details.
