Question: Reads That Are Used In De Novo Assemblies That Did Not Map To Contigs
I did a de novo transcriptome assembly from RNA-seq reads using the Oases assembler. Oases has an option to output the reads that were not used in constructing the contigs. Using this, I split the raw reads (the original FASTQ file) into used and unused sets, and then used bowtie to map only the used reads back to the contigs. However, only about 70% of them mapped, and the rest were reported as unmapped (bowtie has an option to write these out).

What could explain reads that were used to build the contigs not mapping back to those contigs? Any advice or suggestions are greatly appreciated.

Try relaxing your bowtie alignment stringency, and build the index with --offrate 1.

eXpress suggests using bowtie2 with these options: --offrate 1 -a -X 800 --rdg 6,5 --rfg 6,5 --score-min L,-.6,-.4 --no-discordant --no-mixed
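For what it's worth, here is a minimal Python sketch of how those options could be assembled into the two commands, with --offrate 1 passed at index-build time as suggested above. It assumes paired-end reads (which -X, --no-discordant and --no-mixed imply); the function name, file names, index prefix and thread count are all placeholders I made up for illustration. It only builds and prints the command lines, so you can check them before running anything.

```python
import shlex


def bowtie2_commands(contigs_fasta, index_prefix, reads_1, reads_2, out_sam, threads=4):
    """Return (build_cmd, align_cmd) as argument lists.

    All file names and the thread count are hypothetical placeholders.
    """
    # Index the assembled contigs with a dense offset rate.
    build_cmd = ["bowtie2-build", "--offrate", "1", contigs_fasta, index_prefix]

    # Relaxed alignment settings quoted in the reply above.
    align_cmd = [
        "bowtie2",
        "-a",                         # report all alignments
        "-X", "800",                  # maximum fragment length
        "--rdg", "6,5",               # read gap open/extend penalties
        "--rfg", "6,5",               # reference gap open/extend penalties
        "--score-min", "L,-.6,-.4",   # minimum alignment score function
        "--no-discordant",
        "--no-mixed",
        "-p", str(threads),
        "-x", index_prefix,
        "-1", reads_1,
        "-2", reads_2,
        "-S", out_sam,
    ]
    return build_cmd, align_cmd


if __name__ == "__main__":
    build, align = bowtie2_commands(
        "contigs.fa", "contigs_idx",
        "used_1.fastq", "used_2.fastq",
        "used_vs_contigs.sam",
    )
    print(shlex.join(build))
    print(shlex.join(align))
```

If the printed commands look right, they can be pasted into a shell or passed to subprocess.run as lists.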

IMO, this might happen because of the error correction that the assembler performs, i.e. choosing the base at each position by the majority of the reads covering it. Reads that carry a different base at that position (or an indel) may then fail to align back. Another possibility is that reads get truncated during assembly based on quality, but I am not an expert in Oases. You can investigate this further by lowering the alignment stringency when you map your reads back to the assembled contigs.
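To make the majority-base idea concrete, here is a toy Python sketch. It is not how Oases works internally (that is a de Bruijn graph assembler); it just assumes a handful of made-up, equal-length reads that are already aligned column by column, builds a majority-vote consensus, and counts mismatches of each read against it. The reads and the mismatch cutoff are invented for illustration.

```python
from collections import Counter


def consensus(reads):
    """Majority base per column for equal-length, pre-aligned reads."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))


def mismatches(read, contig):
    """Number of positions where the read differs from the contig."""
    return sum(a != b for a, b in zip(read, contig))


# Hypothetical reads: the last one differs from the others at two positions.
reads = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACCTACGAAC",
]

contig = consensus(reads)

# Hypothetical strict aligner setting: allow at most one mismatch.
max_mismatches = 1

for read in reads:
    n = mismatches(read, contig)
    status = "maps" if n <= max_mismatches else "unmapped"
    print(read, n, status)
```

All four reads contributed to the consensus, but the last one has two mismatches against it and is rejected at a cutoff of one. Raising the cutoff to two lets it map, which is the effect of lowering the alignment stringency.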

