I have been trying to align NGS reads to a reference genome using HiSat2. The alignment runs fine, but I run into a problem when I try to convert the SAM file into a BAM file (using samtools):
[E::sam_parse1] no CIGAR operations
[W::sam_read1] parse error at line 47
[bam_sort_core] truncated file. Aborting.
The following is line 47 and it seems like the CIGAR string is missing.
A00358:36:H5MFNDMXX:1:1101:11279:1047 147 chr6B_part1 62835665 60 150549 -529 TCATCTTGTGCTCATGATCTCAATCACCGAAGCATCGTCATGATCTCCATCATCACCGGGGCAACACCTTGATCTCCATCGTAGCATCGTTGTCGTCTCGCCAAATATTGTTACTACGACGATCGCTGGCGCTTAGTGATAAAGTAAAAC :FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFF,FFFFFFFFFFFFFF:FF:FFFFFFF,FFFFFFF:FFFFFFFFFFF:FFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFF AS:i:-8 XN:i:0 XM:i:2 XO:i:0 XG:i:0 NM:i:2 MD:Z:120T7T21 YS:i:-11YT:Z:CP NH:i:1
This problem happens again and again, and I have noticed that a similar problem seems to occur in Bowtie2 when many threads are used (http://bowtie-bio.sourceforge.net/bowtie2/index.shtml). The problem disappears when I run the alignment with only one thread in HiSat2, but then the alignment ends up taking a few days. Has anyone else run into a similar problem? My hisat2 command:
srun hisat2 \
-p 10 \
-x $Index \
-1 $Reads/27_f1.fq.gz \
-2 $Reads/27_r2.fq.gz \
-S $Home/result.sam
Please add your samtools command line. I would simply realign and pipe the output directly into samtools, like:
hisat2 (...options) ... | samtools view -b -o out.bam -
No need to store SAM files. This is called a Unix pipe.
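Before realigning, it may also be worth checking how many records in the existing SAM file are damaged, since samtools stops at the first parse error. The following is only a rough sketch: `check_sam` is a hypothetical helper name, and it assumes the file is tab-separated as the SAM format requires. It skips header lines and reports any record with fewer than the 11 mandatory fields or with a sixth field that does not look like a CIGAR string.

```shell
# Hypothetical helper (not part of samtools or HISAT2): list suspicious SAM records.
# Output is "<line number><TAB><reason>", one line per problem record.
check_sam() {
    awk -F '\t' '
        # Skip header lines such as @HD, @SQ, @PG
        /^@/ { next }
        # A SAM alignment record has at least 11 mandatory fields
        NF < 11 { printf "%d\ttoo few fields (%d)\n", NR, NF; next }
        # Field 6 must be "*" or a run of <length><operation> pairs
        $6 !~ /^(\*|([0-9]+[MIDNSHPX=])+)$/ { printf "%d\tbad CIGAR: %s\n", NR, $6 }
    ' "$1"
}

# Usage (file name taken from the question): check_sam "$Home/result.sam" | head
```

If only a handful of lines are reported, that would point to occasional interleaved writes rather than a systematically broken alignment, but this check does not validate anything beyond the field count and the CIGAR syntax.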