Abnormal per sequence GC content and weird sequences in bacterial RNA-seq
4.1 years ago
ikangkim ▴ 50

Hi,

I'm analyzing my first RNA-seq results, which were obtained from a bacterial strain under two conditions: control vs. stress.

Because the bowtie2 mapping rate for the stress condition was much lower (57%) than for the control (86%), I looked at the FastQC results and also inspected the sequences manually.

This is the per-sequence GC content plot for the control condition. I don't see a problem here: there is a single peak around the genomic GC content (52%).

[Figure: Control]

However, the per-sequence GC content for the stress condition looks very abnormal, with a very thick tail toward low GC.

[Figure: Stress]
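For anyone who wants to reproduce this kind of check outside FastQC, here is a minimal sketch of a per-read GC distribution. The function names and the 10% bin width are illustrative assumptions; this is not how FastQC itself computes or normalizes its plot.

```python
from collections import Counter


def gc_percent(seq):
    """Return the GC percentage of a read, counting only A/C/G/T bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    # Reads with no A/C/G/T at all (e.g. all N) are reported as 0% here.
    return 100.0 * gc / acgt if acgt else 0.0


def gc_histogram(seqs, bin_width=10):
    """Count reads per GC bin; each key is the lower edge of a bin in percent.

    bin_width is an arbitrary choice for this sketch.
    """
    hist = Counter()
    for s in seqs:
        pct = gc_percent(s)
        # Put reads with exactly 100% GC into the last bin.
        lower = min(int(pct // bin_width) * bin_width, 100 - bin_width)
        hist[lower] += 1
    return dict(sorted(hist.items()))
```

Running `gc_histogram` separately on the control and stress reads should show whether the low-GC tail comes from a distinct subpopulation of reads rather than a shift of the whole distribution.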

Also, when I checked the sequences from the stress condition, I found that many of them look very strange, like the ones below.

A00718:140:HWK2FDSXX:3:1414:5059:20807 1:N:0:CCTATGCC+CAAGCTTA
CAAAAAAAAAGAACAAGCAAAAGAACACAACAAAAAAAAAAAAAACAAAAAAAAAAAAACAAAAAACAAAAAAAAAAAAAAAAAAAAAAAAACAAAAAAAT
A00718:140:HWK2FDSXX:3:1561:21287:5572 1:N:0:CCTATGCC+CAAGCTTA
GTTTTTTTTGTCTTTTGTGTTTTGTTGTGGTTGGTTTGCTTTGTTGTTTTTGTGGGTTGGTTGTTTTTTTTTTTGTTGTTTGTTTTTTTTTTTTTTTGGTT
A00718:140:HWK2FDSXX:3:1436:24948:22310 1:N:0:CCTATGCC+CAAGCTTA
CAGCACCACACAAGCAGACCCCTGCGCACAAACACGAAACCCACCCGCCCGGGCCCCCGCGCCCGCCGGGGGGGGGGGGGGGGGGGCGGGGGGGGGGGGGC
A00718:140:HWK2FDSXX:3:1337:16532:12273 1:N:0:CCTATGCC+CAAGCTTA
CACCAACAAAAAAAAACCAAAACACAAAAAACCAAAACCAAAAAACAAAAACAAAAAAAAAAAACCAAAAAAAAAAAAACAACAAAAAAACAAAAGAACAA

Note that the above plots and sequences were obtained after trimming (by bbduk).
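To estimate how many reads look like the ones above, one option is to flag reads whose base composition is dominated by one or two nucleotides. The sketch below is a rough illustration, not a validated filter: the function names and both thresholds (`max_dominant`, `min_entropy`) are assumptions chosen for the example, and composition-based entropy will not catch repeats such as `ACACAC...`.

```python
import math
from collections import Counter


def base_entropy(seq):
    """Shannon entropy (in bits) of the base composition of a read.

    The maximum is 2.0 when A, C, G and T are equally frequent.
    """
    counts = Counter(seq.upper())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def dominant_base_fraction(seq):
    """Fraction of the read taken up by its most common base."""
    if not seq:
        return 0.0
    counts = Counter(seq.upper())
    return max(counts.values()) / len(seq)


def is_low_complexity(seq, max_dominant=0.7, min_entropy=1.5):
    """Flag a read as low complexity.

    Both thresholds are arbitrary values for this sketch and would need
    tuning against real data.
    """
    return (dominant_base_fraction(seq) > max_dominant
            or base_entropy(seq) < min_entropy)
```

Comparing the fraction of flagged reads between the control and stress libraries would show whether these sequences account for the drop in mapping rate.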

What might have gone wrong in the stress condition?

Thanks.

RNA-Seq sequencing • 1.4k views

Maybe do some taxonomic analysis of your data (blobtools, kraken, centrifuge). The low mapping rate is concerning.


Thanks for your suggestion.

Because I found that ~10% of the reads from the stress condition mapped to rRNA genes (despite rRNA depletion), I performed taxonomic classification of randomly sampled reads against the SILVA SSU rRNA database using the classify.seqs command in mothur.

The results showed little or no contamination.
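The post does not say how the reads were sampled. One common way to draw a uniform random subset from a large FASTQ file without loading it into memory is reservoir sampling. The sketch below assumes plain 4-line FASTQ records; `read_fastq` and `reservoir_sample` are hypothetical helper names, not part of mothur or any other tool.

```python
import random


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (or an empty header line)
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def reservoir_sample(records, k, seed=0):
    """Return up to k records chosen uniformly at random in a single pass."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = []
    for i, rec in enumerate(records):
        if i < k:
            sample.append(rec)
        else:
            # Keep the new record with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = rec
    return sample
```

A typical call would be `reservoir_sample(read_fastq(open("stress.fastq")), 10000)`, where the file name and sample size are placeholders.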

Thanks.
