Question: Low % of aligned reads
kml wrote:

Hi, I'm new to RNA-seq analysis and recently started analyzing my RNA-seq data (Illumina, paired-end) using the Galaxy interface. After FastQC and Trim Galore, I'm at the alignment step. I have tried several tools for that (Bowtie2, HISAT2, TopHat) against both the available draft genome and the transcriptome, but for half of my samples the percentage of aligned reads is very low: 20% at best (for the other half it is 85% at best). I'm working on a unicellular alga whose transcriptome and genome were published just a few years ago. I'm not sure whether it is better to align against the genome or the transcriptome, or whether the alignment settings are OK (I used the defaults for all of them). I would appreciate any advice, tip, or suggestion at this stage. Thank you all.

bowtie rna-seq galaxy alignment
written 2.0 years ago by kml

Please ask at http://biostar.usegalaxy.org/

written 2.0 years ago by RamRS

lakhujanivijay wrote:

but for half of my samples the % of aligned reads are very low: 20% at best (the other half 85% at best).

Potential contamination. Check the number of multi-mapped reads.

I'm not sure if it is better to use the genome or transcriptome to align against, and whether the alignment settings are ok (I just used the default of them all)

Aligning against a genome allows you to identify novel isoforms, but it requires a splice-aware aligner such as HISAT2 or STAR. Aligning against a transcriptome lets you quantify only known (pre-determined) transcripts; in that case you can use an unspliced aligner such as Bowtie. Which to choose depends on what you want to know and on whether a good-quality genome or transcriptome is available.

written 2.0 years ago by lakhujanivijay

Thank you very much for your comment. By the number of "multi-mapped reads", do you mean the number of reads that aligned more than once? In my case (please see the alignment summary below) that would be 2.02% of pairs and 1.73% of mates. I'm not sure I follow: does this support contamination or not?

[samopen] SAM header is present: 51199 sequences.
40670710 reads; of these:
  40670710 (100.00%) were paired; of these:
    35574195 (87.47%) aligned concordantly 0 times
    4275704 (10.51%) aligned concordantly exactly 1 time
    820811 (2.02%) aligned concordantly >1 times
    ----
    35574195 pairs aligned concordantly 0 times; of these:
      2126754 (5.98%) aligned discordantly 1 time
    ----
    33447441 pairs aligned 0 times concordantly or discordantly; of these:
      66894882 mates make up the pairs; of these:
        65475913 (97.88%) aligned 0 times
        259523 (0.39%) aligned exactly 1 time
        1159446 (1.73%) aligned >1 times
19.50% overall alignment rate
[bam_sort_core] merging from 39 files...

written 2.0 years ago by kml
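
For anyone reading the alignment summary above, here is a minimal Python sketch that pulls the counts out of the text and recomputes the percentages. It assumes the overall alignment rate is the number of aligned mates (two per aligned pair, plus mates that aligned on their own) divided by the total number of mates; that assumption reproduces the reported 19.50%. The function and variable names are made up for illustration and are not part of any aligner.

```python
import re

# Alignment summary copied from the post above (log lines omitted).
SUMMARY = """\
40670710 reads; of these:
  40670710 (100.00%) were paired; of these:
    35574195 (87.47%) aligned concordantly 0 times
    4275704 (10.51%) aligned concordantly exactly 1 time
    820811 (2.02%) aligned concordantly >1 times
    ----
    35574195 pairs aligned concordantly 0 times; of these:
      2126754 (5.98%) aligned discordantly 1 time
    ----
    33447441 pairs aligned 0 times concordantly or discordantly; of these:
      66894882 mates make up the pairs; of these:
        65475913 (97.88%) aligned 0 times
        259523 (0.39%) aligned exactly 1 time
        1159446 (1.73%) aligned >1 times
19.50% overall alignment rate
"""

# Hypothetical field names mapped to the summary lines they are read from.
PATTERNS = {
    "pairs": r"(\d+) \(\S+\) were paired",
    "conc_1": r"(\d+) \(\S+\) aligned concordantly exactly 1 time",
    "conc_multi": r"(\d+) \(\S+\) aligned concordantly >1 times",
    "disc_1": r"(\d+) \(\S+\) aligned discordantly 1 time",
    "mate_0": r"(\d+) \(\S+\) aligned 0 times",
    "mate_1": r"(\d+) \(\S+\) aligned exactly 1 time",
    "mate_multi": r"(\d+) \(\S+\) aligned >1 times",
}


def parse_counts(text):
    """Return the integer count found on each summary line of interest."""
    return {name: int(re.search(pat, text).group(1))
            for name, pat in PATTERNS.items()}


counts = parse_counts(SUMMARY)
total_mates = 2 * counts["pairs"]

# Assumption: every aligned pair contributes two aligned mates.
aligned_mates = (
    2 * (counts["conc_1"] + counts["conc_multi"] + counts["disc_1"])
    + counts["mate_1"] + counts["mate_multi"]
)

overall_pct = 100 * aligned_mates / total_mates
multi_pairs_pct = 100 * counts["conc_multi"] / counts["pairs"]
unaligned_mates_pct = 100 * counts["mate_0"] / total_mates

print(f"overall alignment rate:      {overall_pct:.2f}%")
print(f"pairs aligned concordantly >1 times: {multi_pairs_pct:.2f}%")
print(f"mates that never aligned:    {unaligned_mates_pct:.2f}%")
```

Read this way, the multi-mapped fraction is small, and the low overall rate comes mostly from mates that did not align at all. On its own this neither confirms nor rules out contamination; it only shows that most reads have no match in the reference that was used.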