Question: Abnormal per sequence GC content and weird sequences in bacterial RNA-seq
12 months ago by
ikangkim50
ikangkim50 wrote:

Hi,

I'm analyzing my first RNA-seq results, which were obtained from a bacterial strain under two conditions: control vs. stress.

Because the bowtie2 mapping rate of the stress condition was much lower (57%) than that of the control (86%), I looked into the FastQC results and also checked the sequences manually.

This is the per sequence GC content plot of the control condition. I see no problem here: the peak sits around the genomic GC content (52%).

[Figure: per sequence GC content, control]

But the per sequence GC content of the stress condition looks very abnormal. There is a very thick tail toward low GC.

[Figure: per sequence GC content, stress]

Also, when I checked the sequences from the stress condition, I found that many of them look very odd, like the ones below.

A00718:140:HWK2FDSXX:3:1414:5059:20807 1:N:0:CCTATGCC+CAAGCTTA
CAAAAAAAAAGAACAAGCAAAAGAACACAACAAAAAAAAAAAAAACAAAAAAAAAAAAACAAAAAACAAAAAAAAAAAAAAAAAAAAAAAAACAAAAAAAT
A00718:140:HWK2FDSXX:3:1561:21287:5572 1:N:0:CCTATGCC+CAAGCTTA
GTTTTTTTTGTCTTTTGTGTTTTGTTGTGGTTGGTTTGCTTTGTTGTTTTTGTGGGTTGGTTGTTTTTTTTTTTGTTGTTTGTTTTTTTTTTTTTTTGGTT
A00718:140:HWK2FDSXX:3:1436:24948:22310 1:N:0:CCTATGCC+CAAGCTTA
CAGCACCACACAAGCAGACCCCTGCGCACAAACACGAAACCCACCCGCCCGGGCCCCCGCGCCCGCCGGGGGGGGGGGGGGGGGGGCGGGGGGGGGGGGGC
A00718:140:HWK2FDSXX:3:1337:16532:12273 1:N:0:CCTATGCC+CAAGCTTA
CACCAACAAAAAAAAACCAAAACACAAAAAACCAAAACCAAAAAACAAAAACAAAAAAAAAAAACCAAAAAAAAAAAAACAACAAAAAAACAAAAGAACAA

Note that the above plots and sequences were obtained after trimming (by bbduk).
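
One rough way to put numbers on how skewed these reads are is to compute, per read, the GC fraction and the fraction taken up by the single most common base. The snippet below is only a minimal sketch: the function names and the 0.7 cutoff are illustrative assumptions, not part of FastQC or bbduk, and the four reads are the ones quoted above.

```python
from collections import Counter


def gc_fraction(seq):
    """Fraction of G and C bases in a read (0.0 for an empty read)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def dominant_base_fraction(seq):
    """Fraction of the read occupied by its single most common base."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return Counter(seq).most_common(1)[0][1] / len(seq)


def is_low_complexity(seq, threshold=0.7):
    """Flag reads dominated by one base. The 0.7 cutoff is an arbitrary assumption."""
    return dominant_base_fraction(seq) >= threshold


# The four example reads from the stress condition, as quoted in the post.
reads = [
    "CAAAAAAAAAGAACAAGCAAAAGAACACAACAAAAAAAAAAAAAACAAAAAAAAAAAAACAAAAAACAAAAAAAAAAAAAAAAAAAAAAAAACAAAAAAAT",
    "GTTTTTTTTGTCTTTTGTGTTTTGTTGTGGTTGGTTTGCTTTGTTGTTTTTGTGGGTTGGTTGTTTTTTTTTTTGTTGTTTGTTTTTTTTTTTTTTTGGTT",
    "CAGCACCACACAAGCAGACCCCTGCGCACAAACACGAAACCCACCCGCCCGGGCCCCCGCGCCCGCCGGGGGGGGGGGGGGGGGGGCGGGGGGGGGGGGGC",
    "CACCAACAAAAAAAAACCAAAACACAAAAAACCAAAACCAAAAAACAAAAACAAAAAAAAAAAACCAAAAAAAAAAAAACAACAAAAAAACAAAAGAACAA",
]

for read in reads:
    print(
        f"GC={gc_fraction(read):.2f}  "
        f"dominant={dominant_base_fraction(read):.2f}  "
        f"low_complexity={is_low_complexity(read)}"
    )
```

Running the same functions over a sample of reads from each condition would show whether the low-GC tail is made up mostly of such single-base-dominated reads.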

What might have gone wrong in the stress condition?

Thanks.

sequencing rna-seq • 245 views
written 12 months ago by ikangkim50

Maybe do some taxonomic analysis of your data (blobtools, kraken, centrifuge). The low mapping rate is concerning.

written 12 months ago by cschu1812.6k

Thanks for your suggestion.

Because I found that ~10% of the reads from the stress condition mapped to rRNA genes (despite rRNA depletion), I performed taxonomic classification of randomly sampled reads against the SILVA SSU rRNA database using the classify.seqs command in mothur.

The result showed little or no contamination.
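
For anyone repeating the random subsampling step, here is a minimal sketch of drawing a fixed number of reads from a FASTQ stream with reservoir sampling. It is not necessarily how the sampling was done here; the function names are hypothetical, and the parser assumes plain 4-line FASTQ records.

```python
import io
import random


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples, assuming 4 lines per record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (or a blank line) stops the iteration
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def reservoir_sample(records, k, seed=0):
    """Return k records drawn uniformly at random from an iterable of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, rec in enumerate(records):
        if i < k:
            sample.append(rec)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = rec  # replace with probability k / (i + 1)
    return sample


# Tiny in-memory FASTQ with made-up reads, only to show usage.
demo = "".join(f"@read{i}\nACGT\n+\nIIII\n" for i in range(10))
picked = reservoir_sample(read_fastq(io.StringIO(demo)), k=3, seed=1)
for header, seq, qual in picked:
    print(header, seq)
```

Reservoir sampling reads the file once and keeps only k records in memory, which is convenient for large FASTQ files.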

Thanks.

written 11 months ago by ikangkim50