I ran HISAT2 (with an index built from a transcriptome multi-FASTA), expecting that it would not perform gapped alignment. I use the following script to run HISAT:
INDEX=./indices/hisat/transcriptome
FASTQ=$1
OUTPUT=./transcriptome_aligned/$2.sam
./software/hisat-0.1.6-beta/hisat \
    -q \
    -p 2 \
    --no-spliced-alignment \
    --end-to-end \
    -x $INDEX \
    -U $FASTQ \
    -S $OUTPUT
Should I still expect gapped alignment in my SAM file? I have records like this in the SAM output.
SRR2144041.255 0 YCL025C 274 255 16M1I33M * 0 0 CAGGCTCAAGAACTAGAAAAAAAATGAAAGTTCGGACAACATAGGCGCTA CCCFFFFFHHHHHJJJJJJJJJJJJJJJIJGIIIIJJJJJIJJIIJJJHH AS:i:-8 XN:i:0 XM:i:0 XO:i:1 XG:i:1 NM:i:1 MD:Z:49 YT:Z:UU NH:i:1
This shows that HISAT2 is still performing gapped alignment even with the options above.
I'm trying to use the output SAM with rsem-calculate-expression, but it returns the following error because of the gapped alignments:
rsem-parse-alignments ./indices/rsem/rsem ./rsem_output/sample.temp/sample ./rsem_output/sample.stat/sample ./transcriptome_aligned/sample.bam 1 -tag XM
Read SRR2144041.836747: RSEM currently does not support gapped alignments, sorry!
"rsem-parse-alignments ./indices/rsem/rsem ./rsem_output/sample.temp/sample ./rsem_output/sample.stat/sample ./transcriptome_aligned/sample.bam 1 -tag XM" failed! Plase check if you provide correct parameters/options for the pipeline!
How do I make sure that HISAT2 doesn't perform gapped alignment? Should I filter the output using grep -v XO:i:0?
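To make the filtering idea concrete, here is a minimal sketch. Note that grep -v XO:i:0 as written would discard the records with zero gap opens and keep the gapped ones, and a plain grep XO:i:0 would also drop the SAM header lines. The sketch instead tests the CIGAR string in column 6. The function name filter_ungapped is hypothetical, and treating I, D and N as the operations RSEM objects to is an assumption on my part.

```shell
# Hypothetical helper: read SAM on stdin, write SAM on stdout.
# Keeps header lines and records whose CIGAR has no I, D or N operation.
# Assumption: those three operations are what RSEM treats as "gapped".
filter_ungapped() {
    awk -F'\t' '
        /^@/          { print; next }   # pass SAM header lines through unchanged
        $6 !~ /[IDN]/ { print }         # column 6 is the CIGAR string
    '
}
```

It could be used as filter_ungapped < input.sam > ungapped.sam. Unmapped reads (CIGAR "*") pass through untouched, and the 16M1I33M record shown earlier would be removed.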
I checked the RSEM manual and found that, to avoid gapped alignments with Bowtie2, RSEM uses the following Bowtie2 parameters:
--sensitive --dpad 0 --gbar 99999999 --mp 1,1 --np 1 --score-min L,0,-0.1
I wonder what the equivalent of --gbar is in HISAT2.
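For comparison, this is roughly how the Bowtie2 parameters quoted above could be applied to the same single-end input. It is only a sketch: the function name align_bowtie2_ungapped and its three arguments are hypothetical, it assumes a Bowtie2 index of the same transcriptome has already been built, and it is not necessarily the exact command line RSEM runs internally.

```shell
# Hypothetical wrapper around bowtie2 using the ungapped settings quoted above.
# Arguments: $1 = Bowtie2 index prefix, $2 = input FASTQ, $3 = output SAM.
align_bowtie2_ungapped() {
    # --gbar 99999999 disallows gaps within that many positions of either
    # read end, which in practice rules out gaps for normal read lengths.
    bowtie2 -q -p 2 \
        --sensitive --dpad 0 --gbar 99999999 \
        --mp 1,1 --np 1 --score-min L,0,-0.1 \
        -x "$1" -U "$2" -S "$3"
}
```

A call might look like align_bowtie2_ungapped ./indices/bowtie2/transcriptome "$FASTQ" "$OUTPUT", where the index path is an assumed location.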