Comparison of log files: STAR vs Hisat
3.4 years ago
XBria ▴ 70

Hello,

I would like to build a table comparing the STAR and Hisat output. How can I put them in one common table when they report different features? Example of STAR output:

Started job on |   Nov 29 12:19:49
Started mapping on |   Nov 29 12:19:53
Finished on |   Nov 29 12:22:38
Mapping speed, Million of reads per hour |   28.83

Number of input reads |   1321477
Average input read length |   152
Uniquely mapped reads number |   1281331
Uniquely mapped reads % |   96.96%
Average mapped length |   151.01
Number of splices: Total |   701317
Number of splices: Annotated (sjdb) |   693405
Number of splices: GT/AG |   697677
Number of splices: GC/AG |   1967
Number of splices: AT/AC |   703
Number of splices: Non-canonical |   970
Mismatch rate per base, % |   0.43%
Deletion rate per base |   0.01%
Deletion average length |   1.53
Insertion rate per base |   0.01%
Insertion average length |   1.29
Number of reads mapped to multiple loci |   16028
% of reads mapped to multiple loci |   1.21%
Number of reads mapped to too many loci |   285
% of reads mapped to too many loci |   0.02%
% of reads unmapped: too many mismatches |   0.00%
% of reads unmapped: too short |   1.80%
% of reads unmapped: other |   0.01%
Number of chimeric reads |   0
% of chimeric reads |   0.00%


Hisat output:

1321477 (100.00%) were paired; of these:

108522 (8.21%) aligned concordantly 0 times

1042850 (78.92%) aligned concordantly exactly 1 time

170105 (12.87%) aligned concordantly >1 times
----
108522 pairs aligned concordantly 0 times; of these:

4952 (4.56%) aligned discordantly 1 time
----
103570 pairs aligned 0 times concordantly or discordantly; of these:

207140 mates make up the pairs; of these:

99460 (48.02%) aligned 0 times

82638 (39.89%) aligned exactly 1 time

25042 (12.09%) aligned >1 times


96.24% overall alignment rate

At face value the STAR results look so much better that I don't think I'd bother making a table of things. If the STAR results are correct then that's the only thing that matters (this would be what every published comparison I've seen has indicated).

Could you please explain how you can tell that the STAR results are much better?

STAR has a 97% unique alignment rate; hisat2 is showing closer to 80% for that.

What the article says is different from what you mention:

"Because we prepared the data for this protocol by aligning all reads in the initial data sets to the whole genome and then extracting only those reads that aligned to chromosome X and their mates, we expect a mapping rate close to 100% for the reads in our reduced data set."

I think the overall alignment rate is 96.96% for STAR and 96.24% for hisat. I hope I am right.

The article doesn't mention anything remotely related to what you're talking about. As I wrote, STAR has a unique alignment rate of ~97% and hisat2 has a unique alignment rate of ~80%. hisat2 does not have a unique alignment rate of 96% (unsurprisingly, STAR essentially always outperforms hisat2 in comparisons).
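To make the unique-versus-overall distinction concrete, here is a small arithmetic check that uses only the counts from the two logs in the question. It is a sketch under one assumption: that hisat2's "overall alignment rate" is the share of individual mates that aligned in any way (concordantly, discordantly, or alone), which does reproduce the 96.24% in the log. The variable names are mine, not anything the tools define.

```python
# Counts copied from the two logs posted in the question.
total_pairs = 1321477

# hisat2 "overall alignment rate": every mate that aligned at all counts,
# so only the 99460 mates that "aligned 0 times" are excluded (assumption).
hisat_unaligned_mates = 99460
hisat_overall = 100 * (1 - hisat_unaligned_mates / (2 * total_pairs))

# hisat2 pairs that aligned concordantly exactly once: the figure closest
# to STAR's "Uniquely mapped reads %".
hisat_unique = 100 * 1042850 / total_pairs

# STAR: uniquely mapped pairs, and unique + multi-mapped pairs as a rough
# "overall"-style figure for comparison.
star_unique = 100 * 1281331 / total_pairs
star_unique_plus_multi = 100 * (1281331 + 16028) / total_pairs

print(f"hisat2 overall alignment rate : {hisat_overall:.2f}%")
print(f"hisat2 concordant, exactly 1  : {hisat_unique:.2f}%")
print(f"STAR uniquely mapped          : {star_unique:.2f}%")
print(f"STAR unique + multiple loci   : {star_unique_plus_multi:.2f}%")
```

So the 96.96% from STAR and the 96.24% from hisat2 are not the same kind of number: the first is a unique rate, the second an overall rate. Like-for-like, unique is roughly 97% vs 79%.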

Are you sure they were aligned to the same exact reference in comparable ways? Hisat is counting 10x as many reads aligning to multiple loci.
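If you still want the common table from the original question, one option is to reduce both logs to categories that exist in both: input pairs, aligned once, aligned to multiple loci, and everything else. Below is a minimal sketch, assuming the STAR log keeps its "name | value" layout and the Hisat summary keeps the wording shown in the question. The function names and the embedded log excerpts are illustrative only. Note that the Hisat rows count concordant pairs, so discordant and single-mate alignments land in the last row.

```python
import re

# Abbreviated copies of the two logs from the question (only the lines used).
STAR_LOG = """\
Number of input reads |   1321477
Uniquely mapped reads number |   1281331
Number of reads mapped to multiple loci |   16028
"""

HISAT_LOG = """\
1321477 (100.00%) were paired; of these:
108522 (8.21%) aligned concordantly 0 times
1042850 (78.92%) aligned concordantly exactly 1 time
170105 (12.87%) aligned concordantly >1 times
"""

def parse_star_log(text):
    """Return {field name: value string} from 'name | value' lines."""
    fields = {}
    for line in text.splitlines():
        if "|" in line:
            key, value = line.split("|", 1)
            fields[key.strip()] = value.strip()
    return fields

def parse_hisat_summary(text):
    """Pull the concordant pair counts out of a paired-end summary."""
    patterns = {
        "total": r"(\d+) \([\d.]+%\) were paired",
        "unique": r"(\d+) \([\d.]+%\) aligned concordantly exactly 1 time",
        "multi": r"(\d+) \([\d.]+%\) aligned concordantly >1 times",
    }
    return {k: int(re.search(p, text).group(1)) for k, p in patterns.items()}

def common_table(star, hisat):
    """Rows of (label, STAR count, Hisat count) for the shared categories."""
    total_s = int(star["Number of input reads"])
    uniq_s = int(star["Uniquely mapped reads number"])
    multi_s = int(star["Number of reads mapped to multiple loci"])
    return [
        ("Input read pairs", total_s, hisat["total"]),
        ("Aligned once", uniq_s, hisat["unique"]),
        ("Aligned to multiple loci", multi_s, hisat["multi"]),
        ("Everything else", total_s - uniq_s - multi_s,
         hisat["total"] - hisat["unique"] - hisat["multi"]),
    ]

rows = common_table(parse_star_log(STAR_LOG), parse_hisat_summary(HISAT_LOG))
star_total, hisat_total = rows[0][1], rows[0][2]

print(f"{'':<26}{'STAR':>20}{'Hisat':>20}")
for label, s, h in rows:
    star_cell = f"{s} ({100 * s / star_total:.2f}%)"
    hisat_cell = f"{h} ({100 * h / hisat_total:.2f}%)"
    print(f"{label:<26}{star_cell:>20}{hisat_cell:>20}")
```

Laid out this way, the multi-mapping row is the one that stands out (170105 vs 16028 pairs), which is why it is worth checking that both aligners used the same reference and comparable settings before reading anything into the other rows.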