I am analyzing paired-end RNA-seq data and have been using TopHat to align the reads to my reference genome. When I align with the default settings and then calculate the insert size, different tools (Picard, Qualimap, etc.) give very inconsistent results with large standard deviations. I was hoping to estimate the mean inner distance from this first alignment and then re-run TopHat with that value. Changing this parameter does not seem to affect the alignment much, but it greatly affects my final results when I use Cufflinks and Cuffdiff to find differentially expressed (DE) genes.

While searching for an answer, I found some information suggesting that aligning to a reference genome may be misleading for this calculation because of introns. So I tried aligning to a reference transcriptome with Bowtie instead and calculated yet another insert size value, one that results in a negative mean inner distance. This value does, however, seem to be fairly consistent across my different data sets.

Could anyone explain the proper way to calculate the insert size for paired-end data? Is initially mapping to a reference transcriptome the way to go?
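
To make clear what I mean by inner distance, here is a minimal sketch of the arithmetic as I understand it: the inner distance is the fragment length minus the lengths of both reads, so it goes negative when the mates overlap. The function names, the example numbers, and the `max_fragment` cutoff of 1000 bp are my own assumptions for illustration, not anything taken from TopHat, Picard, or Qualimap. The fragment lengths would come from something like the TLEN column of a SAM/BAM file.

```python
from statistics import mean, median, stdev


def inner_distance(fragment_length, read_length):
    """Fragment length minus both read lengths (negative if the mates overlap)."""
    return fragment_length - 2 * read_length


def summarize_fragments(fragment_lengths, read_length, max_fragment=1000):
    """Summarize fragment lengths after dropping implausibly long ones.

    max_fragment is an assumed cutoff: pairs that span an intron in a genome
    alignment look far longer than the real fragment and inflate the mean
    and standard deviation.
    """
    kept = [abs(f) for f in fragment_lengths if 0 < abs(f) <= max_fragment]
    if len(kept) < 2:
        raise ValueError("need at least two usable fragment lengths")
    mean_fragment = mean(kept)
    return {
        "n": len(kept),
        "mean_fragment": mean_fragment,
        "median_fragment": median(kept),
        "sd_fragment": stdev(kept),
        "mean_inner_distance": inner_distance(mean_fragment, read_length),
    }


if __name__ == "__main__":
    # Made-up values: five ordinary pairs plus one intron-spanning pair.
    example = [180, 190, 200, 210, 220, 25000]
    for key, value in summarize_fragments(example, read_length=101).items():
        print(key, value)
```

With 101 bp reads and fragments averaging 200 bp, this gives a mean inner distance of -2, which is the kind of negative value I am seeing from the transcriptome alignment. Without the cutoff, the single 25000 bp pair would dominate both the mean and the standard deviation.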