Question: Low % of aligned reads
kml20 wrote, 13 months ago:

Hi, I'm new to RNA-seq analysis and recently started analyzing my RNA-seq data (Illumina, paired-end) using the Galaxy interface. After FastQC and Trim Galore, I'm at the alignment step. I have tried several tools (Bowtie2, HISAT2, TopHat) against both the available draft genome and the transcriptome, but for half of my samples the percentage of aligned reads is very low: 20% at best (for the other half it is 85% at best). I'm working on a unicellular alga whose transcriptome and genome were published just a few years ago. I'm not sure whether it is better to align against the genome or the transcriptome, or whether the alignment settings are OK (I just used the defaults for all of them). I would appreciate any advice, tips, or suggestions at this stage. Thank you all.


Please ask this on the Galaxy Biostar site: http://biostar.usegalaxy.org/

written 13 months ago by RamRS22k
Vijay Lakhujani4.2k (India) wrote, 13 months ago:

> but for half of my samples the % of aligned reads are very low: 20% at best (the other half 85% at best).

Potential contamination. Check the number of multi-mapped reads.

> I'm not sure if it is better to use the genome or transcriptome to align against, and whether the alignment settings are ok (I just used the default of them all)

Aligning to a genome allows you to identify novel isoforms, but you will need a splice-aware aligner such as HISAT2 or STAR. With a transcriptome, you can only quantify known (pre-determined) transcripts; in that case, you can use an aligner like Bowtie. It all depends on what you want to know and whether a good-quality genome or transcriptome is available.
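For anyone running this outside Galaxy, the choice above can be sketched as a small helper that picks the aligner from the reference type. This is only an illustration: the function name, index prefixes, and file names are hypothetical, and Bowtie2 is assumed for the transcriptome case. The `-x`, `-1`, `-2`, and `-S` options are the usual ones for paired-end input in both HISAT2 and Bowtie2.

```python
def build_align_command(reference_kind, index_prefix, reads_1, reads_2, out_sam):
    """Return an argument list for a paired-end alignment run.

    reference_kind "genome"        -> hisat2 (splice-aware)
    reference_kind "transcriptome" -> bowtie2 (no spliced alignment needed)
    """
    aligners = {"genome": "hisat2", "transcriptome": "bowtie2"}
    if reference_kind not in aligners:
        raise ValueError("reference_kind must be 'genome' or 'transcriptome'")
    return [
        aligners[reference_kind],
        "-x", index_prefix,   # prefix of a pre-built index (hypothetical name)
        "-1", reads_1,        # forward (R1) reads
        "-2", reads_2,        # reverse (R2) reads
        "-S", out_sam,        # SAM output file
    ]


# Hypothetical file names, for illustration only.
cmd = build_align_command(
    "genome", "alga_genome_idx", "sample_R1.fq.gz", "sample_R2.fq.gz", "sample.sam"
)
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` if the aligner is installed; all other settings stay at their defaults, as in the original question.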


Thank you very much for your comment. By the number of "multi-mapped reads", do you mean the number of reads that aligned more than once? So in my case (please see the alignment summary output below) it is 2.02% / 1.73%? I'm not sure I follow: does this support contamination or not?

[samopen] SAM header is present: 51199 sequences.
40670710 reads; of these:
  40670710 (100.00%) were paired; of these:
    35574195 (87.47%) aligned concordantly 0 times
    4275704 (10.51%) aligned concordantly exactly 1 time
    820811 (2.02%) aligned concordantly >1 times
    ----
    35574195 pairs aligned concordantly 0 times; of these:
      2126754 (5.98%) aligned discordantly 1 time
    ----
    33447441 pairs aligned 0 times concordantly or discordantly; of these:
      66894882 mates make up the pairs; of these:
        65475913 (97.88%) aligned 0 times
        259523 (0.39%) aligned exactly 1 time
        1159446 (1.73%) aligned >1 times
19.50% overall alignment rate
[bam_sort_core] merging from 39 files...
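To make the numbers in this summary easier to read, here is a minimal Python sketch that parses a paired-end summary of this form and recomputes the overall alignment rate and the multi-mapped share. The function and key names are made up for illustration, and the regular expressions assume exactly the wording shown above, so other aligner versions may need adjustments.

```python
import re

# Counts copied from the alignment summary posted in this thread.
SUMMARY = """\
40670710 reads; of these:
  40670710 (100.00%) were paired; of these:
    35574195 (87.47%) aligned concordantly 0 times
    4275704 (10.51%) aligned concordantly exactly 1 time
    820811 (2.02%) aligned concordantly >1 times
    ----
    35574195 pairs aligned concordantly 0 times; of these:
      2126754 (5.98%) aligned discordantly 1 time
    ----
    33447441 pairs aligned 0 times concordantly or discordantly; of these:
      66894882 mates make up the pairs; of these:
        65475913 (97.88%) aligned 0 times
        259523 (0.39%) aligned exactly 1 time
        1159446 (1.73%) aligned >1 times
19.50% overall alignment rate
"""

# One pattern per count; the names on the left are hypothetical labels.
PATTERNS = {
    "pairs_total": r"(\d+) \(\S+\) were paired",
    "conc_unique": r"(\d+) \(\S+\) aligned concordantly exactly 1 time",
    "conc_multi": r"(\d+) \(\S+\) aligned concordantly >1 times",
    "discordant": r"(\d+) \(\S+\) aligned discordantly 1 time",
    "mates_unaligned": r"(\d+) \(\S+\) aligned 0 times$",
    "mates_unique": r"(\d+) \(\S+\) aligned exactly 1 time",
    "mates_multi": r"(\d+) \(\S+\) aligned >1 times",
}


def parse_summary(text):
    """Extract the read counts from a paired-end alignment summary."""
    counts = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text, flags=re.MULTILINE)
        if match is None:
            raise ValueError(f"could not find '{name}' in summary")
        counts[name] = int(match.group(1))
    return counts


def overall_rate(c):
    """Fraction of all mates (2 per pair) that aligned in any way."""
    aligned_pairs = c["conc_unique"] + c["conc_multi"] + c["discordant"]
    aligned_mates = 2 * aligned_pairs + c["mates_unique"] + c["mates_multi"]
    return aligned_mates / (2 * c["pairs_total"])


def multi_share_of_concordant(c):
    """Share of concordantly aligned pairs that aligned more than once."""
    return c["conc_multi"] / (c["conc_unique"] + c["conc_multi"])


counts = parse_summary(SUMMARY)
print(f"overall alignment rate: {overall_rate(counts):.2%}")
print(f"multi-mapped share of concordant pairs: {multi_share_of_concordant(counts):.2%}")
print(f"mates that aligned nowhere: {counts['mates_unaligned']}")
```

Read this way, the 2.02% and 1.73% lines are the multi-mapped pairs and mates. In this sample they are small next to the mates that aligned 0 times, so the low overall rate comes mainly from reads that do not match the reference at all rather than from multi-mapping.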

written 13 months ago by kml20