Question: Best RNAseq spliced aligner?
19 months ago, Tommy Carstensen (United Kingdom) wrote:

I ran the RNA-seq aligner HISAT2 on 75 bp paired-end reads in FASTQ files like this:

hisat2 \
 -q \
 --phred33 \
 --n-ceil L,0,0.15 \
 --pen-cansplice 0 \
 --pen-noncansplice 12 \
 --pen-canintronlen G,-8,1 \
 --pen-noncanintronlen G,-8,1 \
 --min-intronlen 20 \
 --max-intronlen 500000 \
 --known-splicesite-infile Homo_sapiens.GRCh38.splicesites.tsv \
 --novel-splicesite-outfile out_HISAT/38.89/pass1/ERR188083/splicesites.novel.tsv \
 --rna-strandness FR \
 --mp 6,2 \
 --sp 2,1 \
 --np 1 \
 --rdg 5,3 \
 --rfg 5,3 \
 --score-min L,0.0,-0.2 \
 -k 5 \
 --fr \
 --summary-file out_HISAT/38.89/pass1/ERR188083/summary.txt \
 --new-summary \
 -p 8 \
 --mm \
 --seed 0 \
 --remove-chrname \
 -x Homo_sapiens.GRCh38 \
 -1 ../../../data/geuv/fastq/ERR188083_1.fastq.gz \
 -2 ../../../data/geuv/fastq/ERR188083_2.fastq.gz \
 -S out_HISAT/38.89/pass1/ERR188083/ERR188083.sam

But I get a very poor alignment rate:

HISAT2 summary stats:
    Total pairs: 26025190
        Aligned concordantly or discordantly 0 time: 24148025 (92.79%)
        Aligned concordantly 1 time: 1178218 (4.53%)
        Aligned concordantly >1 times: 686294 (2.64%)
        Aligned discordantly 1 time: 12653 (0.05%)
    Total unpaired reads: 48296050
        Aligned 0 time: 47600213 (98.56%)
        Aligned 1 time: 505745 (1.05%)
        Aligned >1 times: 190092 (0.39%)
    Overall alignment rate: 8.55%
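As a sanity check on how the summary fits together, the overall rate can be recomputed from the individual counts. This is a minimal sketch, assuming the rate counts two reads for every aligned pair plus one read for every mate that aligned on its own, divided by twice the number of pairs; the variable names are made up for illustration. Under that assumption the numbers above reproduce the reported 8.55%.

```shell
# Counts copied from the HISAT2 summary above.
total_pairs=26025190
conc_once=1178218
conc_multi=686294
disc_once=12653
unpaired_once=505745
unpaired_multi=190092

# Assumption: each aligned pair contributes two reads, and each mate of an
# unaligned pair that aligns on its own contributes one read.
aligned_reads=$(( 2 * (conc_once + conc_multi + disc_once) + unpaired_once + unpaired_multi ))
total_reads=$(( 2 * total_pairs ))

# awk handles the floating-point division and rounding.
rate=$(awk -v a="$aligned_reads" -v t="$total_reads" 'BEGIN { printf "%.2f", 100 * a / t }')
echo "Overall alignment rate: ${rate}%"
```

The same counts also show that "Total unpaired reads" is exactly twice the number of pairs that failed to align as a pair, so almost none of the individual mates aligned either.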

I was expecting it to be better than STAR, but it seems that's not the case. What is currently considered the best RNAseq spliced aligner? The 2013 review by Engström is a bit dated now. Based on that review I would choose STAR. Is that still the consensus?

sequencing rna-seq alignment
modified 19 months ago by genomax • written 19 months ago by Tommy Carstensen

If you are going to perform differential expression analysis, Salmon or Kallisto will be helpful, as WouterDeCoster suggested, provided a reference transcriptome is available. You can also keep using HISAT2 with all default settings, although adding --dta (--downstream-transcriptome-assembly) may help.

As per the manual, HISAT2 provides options for transcript assemblers (e.g., StringTie and Cufflinks) to work better with the alignment from HISAT2 (see options such as --dta and --dta-cufflinks).
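A stripped-down run along those lines might look like the sketch below. It is only an illustration: the sample ID, index name, and FASTQ paths are reused from the question, the output name ERR188083.dta.sam and the RUN switch are made up here, and everything other than --dta, the thread count, and the input/output options is left at the HISAT2 defaults.

```shell
# Sample ID and paths are taken from the question; adjust to your own layout.
sample=ERR188083
fastq_dir=../../../data/geuv/fastq

# Build the command as positional parameters so it can be printed and run as-is.
# The output file name is hypothetical.
set -- hisat2 --dta -p 8 \
  -x Homo_sapiens.GRCh38 \
  -1 "$fastq_dir/${sample}_1.fastq.gz" \
  -2 "$fastq_dir/${sample}_2.fastq.gz" \
  -S "${sample}.dta.sam"

# Print the command for review; run it only when RUN=1 and hisat2 is installed.
echo "$*"
if [ "${RUN:-0}" = 1 ] && command -v hisat2 >/dev/null 2>&1; then
  "$@"
fi
```

Printing the command first makes it easy to record exactly what was run, which serves the same reproducibility goal as spelling out every default.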

There are many reviews comparing STAR with HISAT2 (the latest being https://www.nature.com/articles/s41467-017-00050-4). That one covers all kinds of downstream analysis workflows possible with RNA-seq data, and it gives a very nice comparison between HISAT2 and STAR based on the kind of analysis you would like to perform after alignment.

written 19 months ago by prasundutta87

Thanks for pointing me to that very recent review @prasundutta87! I'll have a look. I think I'll stick with STAR, because it gives me a vastly superior alignment rate compared to HISAT2; i.e. well above 90%.

written 19 months ago by Tommy Carstensen
19 months ago, WouterDeCoster (Belgium) wrote:

I think STAR is still considered an excellent choice. If differential expression analysis is your aim, you also have alignment-free methods such as Salmon.

written 19 months ago by WouterDeCoster

Thanks a lot @WouterDeCoster! I'll just stick with STAR then. I don't think my coverage is quite good enough for alignment-free/reference-free approaches. Thanks!

written 19 months ago by Tommy Carstensen

I'm not sure if your coverage plays a role for that, but okay. It's indeed quite worrying that you align so few reads... Are the default settings as bad?

modified 19 months ago • written 19 months ago by WouterDeCoster

Those are all the default settings. I just prefer writing them out explicitly, so that I can reproduce my results if the defaults change. I stripped away all the options except for --known-splicesite-infile and still got the same results. With STAR I get an alignment rate well above 90%.

written 19 months ago by Tommy Carstensen

Oh, seems like nothing to worry about then ;-)

If you would like to test something new, this one looks promising: Hera: A new tool for RNA-Seq analysis

written 19 months ago by WouterDeCoster

Thanks for pointing me to Hera. It sure seems to be fast. I've just asked the authors whether they have measures of accuracy and precision relative to STAR. That's more important to me than incremental speed gains.

written 19 months ago by Tommy Carstensen
19 months ago, genomax (United States) wrote:

I recommend that you also give BBMap a fair shake.

BTW, if you have 90%+ alignment with STAR, why are you looking for something better? Most people would be perfectly happy with that result. I doubt that the extra few percent you may gain with another aligner will give you a drastically different DE result.

modified 19 months ago • written 19 months ago by genomax

Thanks @genomax. I'll have a look at BBMap. It seems their emphasis is on sensitivity. To me accuracy/precision is equally important.

I wonder why RNAseq aligners are being churned out, when STAR works so well. I think I'll just stick with STAR, which is tried and tested and has proved its worth.

written 19 months ago by Tommy Carstensen

"I think I'll just stick with STAR, which is tried and tested and has proved its worth."

That is a personal preference and I have no argument with your choice. It is best to pick an aligner and stick with it since flip-flopping back and forth only leads to trouble.

I have no doubt that given time something else will come along that will best all these programs (like BLAT did for a very specific application compared to BLAST).

BTW: If BBMap does not allow you to tweak a particular parameter, then @Brian (the author of BBMap) would likely add it in a short time (if it achieves analytical gains). He participates here regularly. There is a long thread on BBMap over at SA for your reference.

modified 19 months ago • written 19 months ago by genomax