We commissioned RNA-seq and analysis from a company, which provided us with raw FASTQ files, BAM files, and a count matrix. They used hard clipping and TopHat for the alignment to GRCm38/mm10. I have attempted to recreate their analysis with HISAT2 (same reference genome), using only the default parameters and no separate trimming/clipping. I used samtools to convert the SAM files to BAM files and compared the results from the company's analysis ("TopHat") with my own ("HISAT2") in IGV. The results are very confusing to me. The majority of genes I have (randomly) inspected look highly similar between the two sets of BAM files. See this example gene (TopHat in blue, HISAT2 in red):
So far, so good. However, there are also multiple instances where one analysis picked up good reads while the other did not. This is true in both directions. See these two example genes:
And finally, there are some genes where the coverage in one alignment just looks weirdly skewed compared with the other. For instance:
Does anyone know what might account for these differences? Or which alignment I should use for downstream analysis? I'd be grateful for any feedback!
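To go beyond spot-checking genes in IGV, the per-gene counts from the two alignments could be compared systematically. Below is a minimal Python sketch of that idea, not something I have run on the real data. It assumes both sets of BAM files have been counted with the same tool and the same annotation, so that each alignment yields a gene-to-count table. The function name `flag_discordant_genes`, the thresholds, the gene names, and all the numbers are made up for illustration.

```python
import math


def flag_discordant_genes(counts_a, counts_b, min_reads=20, min_abs_log2_ratio=2.0):
    """Return genes whose raw counts differ strongly between two alignments.

    counts_a, counts_b: dicts mapping gene ID -> raw read count.
    The thresholds are arbitrary illustrative defaults, not recommendations.
    """
    flagged = []
    for gene in sorted(set(counts_a) | set(counts_b)):
        a = counts_a.get(gene, 0)
        b = counts_b.get(gene, 0)
        # Skip genes with too few reads in both alignments to judge.
        if max(a, b) < min_reads:
            continue
        # A pseudocount of 1 avoids division by zero when one aligner found nothing.
        log2_ratio = math.log2((a + 1) / (b + 1))
        if abs(log2_ratio) >= min_abs_log2_ratio:
            flagged.append((gene, a, b, round(log2_ratio, 2)))
    return flagged


# Hypothetical counts for one sample (invented values, not real data).
tophat = {"GeneA": 500, "GeneB": 0, "GeneC": 240, "GeneD": 3}
hisat2 = {"GeneA": 510, "GeneB": 130, "GeneC": 12, "GeneD": 5}

for gene, a, b, ratio in flag_discordant_genes(tophat, hisat2):
    print(gene, a, b, ratio)
```

A negative ratio means the second alignment has more reads for that gene, a positive ratio means the first has more. The counts here are raw, so if the two alignments differ a lot in total mapped reads, normalizing by library size first would probably be sensible. The flagged genes would then be the ones worth opening in IGV.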