7.7 years ago
agata88
Hi all!
I have an RNA-seq project. I trimmed the reads (Trimmomatic, quality threshold 15) and mapped them to the Solanum tuberosum genome with TopHat using default parameters. The summary from TopHat is pasted below.
This is my first RNA-seq analysis, so I am curious what you think. Are these mapping results sufficient for further analysis? The planned steps are: counting reads per gene, calculating RPKM for each gene, identifying differentially expressed genes, and GO and KEGG classification.
Left reads:
Input: 21234359
Mapped: 15737170 (74.1% of input)
of these: 1917790 (12.2%) have multiple alignments (7210 have >20)
Right reads:
Input: 21234359
Mapped: 13864163 (65.3% of input)
of these: 1572581 (11.3%) have multiple alignments (6279 have >20)
69.7% overall read alignment rate.
Aligned pairs: 12446824
of these: 836399 ( 6.7%) have multiple alignments
and: 189037 ( 1.5%) are discordant alignments
57.7% concordant pair alignment rate.
Best,
Agata
Have you tried to figure out what the remaining ~25% of the reads are that are not mapping to potato?
Yes, I checked. BLAST reports those reads as Solanum lycopersicum or as "predicted: Solanum tuberosum". They were perhaps discarded from the alignment because of low mapping quality. So all of the reads are from potato, but only 74% mapped to the Solanum tuberosum reference. Do you think these results are OK?
If 25% of your reads BLAST to the reference but do not map to it with your mapping program, I would suggest using different settings or a different mapping program to reduce potential bias. Disclosure: I am the author of BBMap, but I suggest you try mapping with it instead, as it typically has a higher mapping rate than TopHat when mapping RNA-seq reads to a genome. A 57% concordant pair alignment rate is pretty dismal.
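For anyone wanting to see where the percentages in the summary come from, here is a minimal Python sketch that recomputes them from the raw counts posted above. The helper `pct` and the variable names are made up for illustration. The concordant-rate formula (aligned pairs minus discordant pairs, divided by input pairs) is an assumption: it reproduces the reported 57.7%, but it is not taken from the TopHat documentation.

```python
def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

# Counts copied from the TopHat summary in the question.
input_pairs = 21234359       # reads per mate (left input == right input)
left_mapped = 15737170
right_mapped = 13864163
aligned_pairs = 12446824
discordant_pairs = 189037

left_rate = pct(left_mapped, input_pairs)
right_rate = pct(right_mapped, input_pairs)

# Overall rate: all mapped reads over all input reads (both mates).
overall_rate = pct(left_mapped + right_mapped, 2 * input_pairs)

# Assumption: concordant pairs = aligned pairs - discordant pairs.
concordant_rate = pct(aligned_pairs - discordant_pairs, input_pairs)

print(f"left mapped:      {left_rate}%")
print(f"right mapped:     {right_rate}%")
print(f"overall:          {overall_rate}%")
print(f"concordant pairs: {concordant_rate}%")
```

The gap between the per-mate rates (74.1% and 65.3%) and the concordant pair rate arises because a pair is only counted when both mates align in a consistent orientation and distance, so the lower-mapping right reads pull the pair rate down.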