tophat output summary
7.7 years ago
agata88 ▴ 870

Hi all!

I have an RNA-seq project. I trimmed the reads with Trimmomatic (quality threshold 15) and mapped them to the Solanum tuberosum genome with TopHat using default parameters. The TopHat summary is pasted below.

This is my first RNA-seq analysis, so I am curious what you think of it. Are these mapping results sufficient for further analyses? The downstream analysis will involve counting reads per gene, calculating RPKM for each gene, identifying differentially expressed genes, and GO and KEGG classification.

Left reads:
               Input:  21234359
              Mapped:  15737170 (74.1% of input)
            of these:   1917790 (12.2%) have multiple alignments (7210 have >20)
Right reads:
               Input:  21234359
              Mapped:  13864163 (65.3% of input)
            of these:   1572581 (11.3%) have multiple alignments (6279 have >20)
69.7% overall read alignment rate.

Aligned pairs:  12446824
     of these:    836399 ( 6.7%) have multiple alignments
          and:    189037 ( 1.5%) are discordant alignments
57.7% concordant pair alignment rate.

Best,

Agata

RNA-Seq tophat • 2.9k views

Have you tried to figure out what the remaining ~25% of the reads are that are not mapping to potato?


Yes, I've checked that, and BLAST tells me that those reads are Solanum lycopersicum or "PREDICTED: Solanum tuberosum". Those reads were perhaps discarded from the alignment because of low mapping quality, so all of the reads are from potato, but only 74% were mapped to the Solanum tuberosum reference. Do you think these results are OK?


If 25% of your reads are BLASTing to the reference but not mapping to it with your mapping program, I would suggest using different settings or a different mapping program to reduce potential bias. While noting that I am the author, I suggest you try mapping with BBMap instead, as it typically has a higher mapping rate than TopHat when mapping RNA-seq reads to a genome. A 57.7% concordant pair alignment rate is pretty dismal.
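
For reference, the percentages in the TopHat summary from the question can be re-derived from its raw counts. The following is only a minimal Python sketch: the variable names and the `pct` helper are made up for illustration, and it assumes the concordant pair rate is computed as (aligned pairs minus discordant pairs) divided by input pairs, which reproduces the reported 57.7% but is not taken from the TopHat source.

```python
# Counts copied from the TopHat summary in the question.
input_pairs = 21234359       # "Input" (same for left and right reads)
left_mapped = 15737170       # left reads mapped
right_mapped = 13864163      # right reads mapped
aligned_pairs = 12446824     # "Aligned pairs"
discordant_pairs = 189037    # "are discordant alignments"


def pct(part, whole):
    """Hypothetical helper: percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


left_rate = pct(left_mapped, input_pairs)
right_rate = pct(right_mapped, input_pairs)

# Overall rate: all mapped reads over all input reads (two reads per pair).
overall_rate = pct(left_mapped + right_mapped, 2 * input_pairs)

# Assumption: concordant pairs = aligned pairs minus discordant pairs.
concordant_rate = pct(aligned_pairs - discordant_pairs, input_pairs)

print(f"left mapped:      {left_rate}%")
print(f"right mapped:     {right_rate}%")
print(f"overall:          {overall_rate}%")
print(f"concordant pairs: {concordant_rate}%")
```

The gap between the per-read rates (74.1% and 65.3%) and the concordant pair rate arises because a pair only counts as concordant when both mates align consistently with each other, so reads whose mate failed to map do not contribute.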
