I have a dataset of 14 million 50 bp single-end Illumina reads, and I used TopHat to map them to a genome that I indexed with Bowtie. I do not have a GFF annotation for the genome. When I ran it, only 50% of the reads mapped. I have other libraries from the same organism (200 million reads in total), but I ran just one as a first approach.
Does anyone know what could be happening? I used the default parameters. Which parameters should I modify to get a better mapping rate?
Thank you in advance for your help.