Question: De novo transcriptome assembly: Low concordant/discordant reads, High overall alignment
7 months ago, willthompson131 wrote:

Hey everybody,

I'm pretty new to the RNA-seq world, but I have a troubleshooting question I hope someone can help with.

I built a de novo assembly with Trinity and wanted to run some QA checks on it using Bowtie 2. When I map the reads used to construct the assembly back to it, I get this:

96120217 reads; of these:
  96120217 (100.00%) were paired; of these:
    95910642 (99.78%) aligned concordantly 0 times
    51816 (0.05%) aligned concordantly exactly 1 time
    157759 (0.16%) aligned concordantly >1 times
    95910642 pairs aligned concordantly 0 times; of these:
      8609030 (8.98%) aligned discordantly 1 time
    87301612 pairs aligned 0 times concordantly or discordantly; of these:
      174603224 mates make up the pairs; of these:
        27478994 (15.74%) aligned 0 times
        38739984 (22.19%) aligned exactly 1 time
        108384246 (62.07%) aligned >1 times
85.71% overall alignment rate

Obviously, 99.78% of pairs failing to align concordantly is not what I expected. Does anyone have a likely explanation?
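As a sanity check on how a high overall rate can coexist with almost no concordant pairs, the overall figure can be reproduced from the counts in the summary. This is a minimal sketch, assuming the overall rate counts every mate that aligned in any way (concordantly, discordantly, or on its own); the function and variable names are illustrative only.

```python
# Counts copied from the Bowtie 2 summary above.
total_pairs = 96_120_217
concordant_pairs = 51_816 + 157_759               # exactly 1 time + >1 times
discordant_pairs = 8_609_030
single_mates_aligned = 38_739_984 + 108_384_246   # mates aligned exactly 1 time + >1 times


def overall_alignment_rate(total_pairs, concordant_pairs,
                           discordant_pairs, single_mates_aligned):
    """Fraction of all mates that aligned in any way (assumed definition)."""
    # Each concordant or discordant pair contributes two aligned mates.
    aligned_mates = 2 * (concordant_pairs + discordant_pairs) + single_mates_aligned
    return aligned_mates / (2 * total_pairs)


rate = overall_alignment_rate(total_pairs, concordant_pairs,
                              discordant_pairs, single_mates_aligned)
print(f"{100 * rate:.2f}% overall alignment rate")  # 85.71%, matching the summary

# Share of aligned mates that only aligned as individual mates, not as a pair.
aligned_mates = 2 * (concordant_pairs + discordant_pairs) + single_mates_aligned
print(f"{100 * single_mates_aligned / aligned_mates:.1f}% of aligned mates aligned unpaired")
```

So most of the 85.71% comes from mates that align individually rather than as proper pairs, which is the part that needs explaining.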

Thank you so much in advance for the help.


What's the organism and the assembled transcriptome quality (length distribution etc.)?

written 7 months ago by Asaf

...and what is the read length of the sequencing experiment?

written 7 months ago by ATpoint

The organism is a salamander, Ambystoma opacum.

Read length is 2 x 75 bp.

Counts of transcripts, etc.

    Total trinity 'genes': 150490
    Total trinity transcripts: 244195
    Percent GC: 46.48

Stats based on ALL transcript contigs:

    Contig N10: 7579
    Contig N20: 5446
    Contig N30: 4175
    Contig N40: 3095
    Contig N50: 2141

    Median contig length: 321
    Average contig: 785.91
    Total assembled bases: 191916257

Stats based on ONLY LONGEST ISOFORM per 'GENE':

    Contig N10: 6809
    Contig N20: 4819
    Contig N30: 3513
    Contig N40: 2453
    Contig N50: 1478

    Median contig length: 302
    Average contig: 678.32
    Total assembled bases: 102080495
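
For anyone reading the Nx values above: a minimal sketch of how such a statistic is usually computed, assuming the common definition (the contig length at which contigs of that length or longer hold at least x% of the assembled bases). The `nx` helper and the toy lengths are illustrative only and are not taken from Trinity's code.

```python
def nx(contig_lengths, x):
    """Return the Nx contig length: the length L such that contigs of
    length >= L contain at least x percent of all assembled bases."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    threshold = sum(contig_lengths) * x / 100
    running = 0
    # Walk from the longest contig down, accumulating bases.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= threshold:
            return length


# Toy example with made-up contig lengths (total = 1500 bases).
toy_lengths = [100, 200, 300, 400, 500]
print(nx(toy_lengths, 50))  # 400: 500 + 400 = 900 >= 750
print(nx(toy_lengths, 10))  # 500: 500 >= 150
```

With a median contig length near 300 and an N50 above 1400, most contigs are short while most bases sit in a minority of long contigs.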
modified 7 months ago • written 7 months ago by willthompson131