Best RNAseq spliced aligner?
6.7 years ago

I've run the RNA-seq alignment software HISAT2 on 75bp PE reads in fastq files like this:

hisat2 \
 -q \
 --phred33 \
 --n-ceil L,0,0.15 \
 --pen-cansplice 0 \
 --pen-noncansplice 12 \
 --pen-canintronlen G,-8,1 \
 --pen-noncanintronlen G,-8,1 \
 --min-intronlen 20 \
 --max-intronlen 500000 \
 --known-splicesite-infile Homo_sapiens.GRCh38.splicesites.tsv \
 --novel-splicesite-outfile out_HISAT/38.89/pass1/ERR188083/splicesites.novel.tsv \
 --rna-strandness FR \
 --mp 6,2 \
 --sp 2,1 \
 --np 1 \
 --rdg 5,3 \
 --rfg 5,3 \
 --score-min L,0.0,-0.2 \
 -k 5 \
 --fr \
 --summary-file out_HISAT/38.89/pass1/ERR188083/summary.txt \
 --new-summary \
 -p 8 \
 --mm \
 --seed 0 \
 --remove-chrname \
 -x Homo_sapiens.GRCh38 \
 -1 ../../../data/geuv/fastq/ERR188083_1.fastq.gz \
 -2 ../../../data/geuv/fastq/ERR188083_2.fastq.gz \
 -S out_HISAT/38.89/pass1/ERR188083/ERR188083.sam

But I get a very poor alignment:

HISAT2 summary stats:
    Total pairs: 26025190
        Aligned concordantly or discordantly 0 time: 24148025 (92.79%)
        Aligned concordantly 1 time: 1178218 (4.53%)
        Aligned concordantly >1 times: 686294 (2.64%)
        Aligned discordantly 1 time: 12653 (0.05%)
    Total unpaired reads: 48296050
        Aligned 0 time: 47600213 (98.56%)
        Aligned 1 time: 505745 (1.05%)
        Aligned >1 times: 190092 (0.39%)
    Overall alignment rate: 8.55%
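
As a sanity check on these numbers, the overall rate can be reproduced from the individual counts. The sketch below assumes that every aligned pair counts as two aligned reads and every unpaired alignment as one, out of two reads per input pair; that reading is consistent with the figures above, but it is an interpretation of the summary rather than something taken from the HISAT2 source. The variable names are made up for illustration.

```shell
# Counts copied from the HISAT2 summary above.
total_pairs=26025190
conc_once=1178218       # pairs aligned concordantly 1 time
conc_multi=686294       # pairs aligned concordantly >1 times
disc_once=12653         # pairs aligned discordantly 1 time
unpaired_once=505745    # unpaired reads aligned 1 time
unpaired_multi=190092   # unpaired reads aligned >1 times

# Assumption: an aligned pair contributes two reads, an unpaired hit one read,
# and the denominator is two reads per input pair.
overall_rate=$(awk -v tp="$total_pairs" -v c1="$conc_once" -v cm="$conc_multi" \
                   -v d1="$disc_once" -v u1="$unpaired_once" -v um="$unpaired_multi" \
    'BEGIN { printf "%.2f", 100 * (2 * (c1 + cm + d1) + u1 + um) / (2 * tp) }')

echo "Overall alignment rate: ${overall_rate}%"
```

So the 8.55% is internally consistent: the counts add up, and the problem is that almost no reads align at all.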

I was expecting it to be better than STAR, but it seems that's not the case. What is currently considered the best RNAseq spliced aligner? The 2013 review by Engström is a bit dated now. Based on that review I would choose STAR. Is that still the consensus?

RNA-Seq alignment sequencing • 4.5k views

If you are going to perform differential expression analysis, as WouterDeCoster suggested, Salmon or Kallisto will be helpful if a reference transcriptome is available. You can also keep using HISAT2 with all default settings, although adding --dta (--downstream-transcriptome-assembly) may be helpful.

As per the manual, HISAT2 provides options for transcript assemblers (e.g., StringTie and Cufflinks) to work better with the alignment from HISAT2 (see options such as --dta and --dta-cufflinks).
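
For reference, a default-settings run with --dta could look roughly like the sketch below. The index prefix, FASTQ paths and thread count are taken from the question; the output file name is a placeholder chosen here for illustration. The command is only printed as a dry run, not executed.

```shell
# Dry run: build a HISAT2 call that leaves every scoring option at its default
# and only adds --dta for downstream transcript assembly (e.g. StringTie).
index=Homo_sapiens.GRCh38                          # index prefix from the question
r1=../../../data/geuv/fastq/ERR188083_1.fastq.gz   # mate 1 from the question
r2=../../../data/geuv/fastq/ERR188083_2.fastq.gz   # mate 2 from the question
out=ERR188083.dta.sam                              # placeholder output name (assumption)

hisat2_cmd="hisat2 -p 8 --dta -x $index -1 $r1 -2 $r2 -S $out"

# Print the command for inspection; to actually run it: eval "$hisat2_cmd"
echo "$hisat2_cmd"
```

If the target assembler is Cufflinks rather than StringTie, --dta-cufflinks would take the place of --dta.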

There are many reviews comparing STAR with HISAT2, the latest being https://www.nature.com/articles/s41467-017-00050-4. It covers all kinds of downstream analysis workflows possible with RNA-seq data, and it gives a very nice comparison between HISAT2 and STAR based on the kind of analysis you would like to perform after alignment.


Thanks for pointing me to that very recent review @prasundutta87! I'll have a look. I think I'll stick with STAR, because it gives me a vastly superior alignment rate compared to HISAT2; i.e. well above 90%.

6.7 years ago

I think STAR is still considered an excellent choice. If a differential expression analysis is your aim, you also have alignment-free methods such as Salmon.
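
A minimal sketch of what the alignment-free route could look like with Salmon is given below. It assumes a Salmon index has already been built from a reference transcriptome; the index directory and output directory names are hypothetical, while the FASTQ paths are the ones from the question. The command is only printed as a dry run.

```shell
# Dry run: quantify the paired-end reads against a prebuilt Salmon index.
salmon_index=GRCh38_transcripts_salmon_index       # hypothetical index directory (assumption)
r1=../../../data/geuv/fastq/ERR188083_1.fastq.gz   # mate 1 from the question
r2=../../../data/geuv/fastq/ERR188083_2.fastq.gz   # mate 2 from the question
out_dir=salmon_out/ERR188083                       # hypothetical output directory (assumption)

# -l A asks Salmon to infer the library type automatically.
salmon_cmd="salmon quant -i $salmon_index -l A -1 $r1 -2 $r2 -p 8 -o $out_dir"

# Print the command for inspection; to actually run it: eval "$salmon_cmd"
echo "$salmon_cmd"
```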


Thanks a lot @WouterDeCoster! I'll just stick with STAR then. I don't think my coverage is quite good enough for alignment/reference-free approaches. Thanks!


I'm not sure if your coverage plays a role for that, but okay. It's indeed quite worrying that you align so few reads... Are the default settings as bad?


Those are all the default settings. I just prefer writing them explicitly, so I can reproduce my results, if the defaults are changed. I stripped away all the options except for --known-splicesite-infile and still got the same results. With STAR I get an alignment rate well above 90%.


Oh, seems like nothing to worry about then ;-)

If you would like to test something new, this one looks promising: Hera: A new tool for RNA-Seq analysis


Thanks for pointing me to Hera. It sure seems to be fast. I've just asked them, if they have measures of accuracy and precision relative to STAR. That's more important to me than incremental speed gains.

6.7 years ago
GenoMax 141k

I recommend that you also give BBMap a fair shake.

BTW, if you have 90%+ alignments with STAR, why are you looking for something better? Most people would be perfectly happy with that result. I doubt that the extra few percent you may gain from another aligner will give you a drastically different DE result.


Thanks @genomax. I'll have a look at BBMap. It seems their emphasis is on sensitivity. To me accuracy/precision is equally important.

I wonder why RNAseq aligners are being churned out, when STAR works so well. I think I'll just stick with STAR, which is tried and tested and has proved its worth.


"I think I'll just stick with STAR, which is tried and tested and has proved its worth."

That is a personal preference and I have no argument with your choice. It is best to pick an aligner and stick with it since flip-flopping back and forth only leads to trouble.

I have no doubt that given time something else will come along that will best all these programs (like BLAT did for a very specific application compared to BLAST).

BTW: If BBMap does not allow you to tweak a particular parameter, then @Brian (author of BBMap) would likely add it in a short time (if it achieves analytical gains). He participates here regularly. There is a long thread on BBMap over at SA for your reference.
