Question: CIGAR and sequence length are inconsistent after HISAT alignment
17 months ago, DVA (530), United States, wrote:


I am trying to understand an error when running "samtools view":

Line 6040, sequence length 67 vs 76 from CIGAR
Parse error at line 6040: CIGAR and sequence length are inconsistent

I went back to check line 6040 of the SAM file (generated by HISAT2):

NR500449:117:H7WMXX:2:11101:6188:2761        99      10      103287073      60       3S73M   =       103287096       102     TGGCAAGAGTGAGATGGCACGCCACCTTCGGGAATACCAGGACTTGCTCAATGTCAAAAACATTGAG     AAAAAE6AA/EE/E///EEE//AEEEE/E/////E6E/EE/E<E///E//<EEE//</E//<AE6EE/E/E/EEE/    AS:i:-3 XN:i:0  XM:i:0  XO:i:0  XG:i:0 NM:i:0   MD:Z:73 YS:i:-12        YT:Z:CP NH:i:1

And the header lines for this read pair in the R1 and R2 FASTQ files:

@NR500449:117:H7WMXX:2:11101:6188:2761 1:N:0:3

@NR500449:117:H7WMXX:2:11101:6188:2761 2:N:0:3

Indeed, the CIGAR string (3S73M) accounts for 76 bp, the same as the read length in the FASTQ file, but the SEQ field in the SAM line has only 67 bp. Any idea why this would happen and how to deal with it? Thank you very much.
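For anyone who wants to check a whole file for this kind of mismatch, here is a minimal sketch in Python. It assumes a plain-text, tab-separated SAM file and relies on the SAM specification's rule that only the M, I, S, = and X operations consume read bases. The function names `cigar_query_length` and `find_inconsistent_records` are made up for this illustration and are not part of samtools or HISAT2.

```python
import re

# CIGAR operations that consume bases of the read (SEQ) per the SAM spec.
QUERY_CONSUMING_OPS = set("MIS=X")
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_query_length(cigar):
    """Return the number of read bases implied by a CIGAR string."""
    return sum(
        int(count)
        for count, op in CIGAR_TOKEN.findall(cigar)
        if op in QUERY_CONSUMING_OPS
    )


def find_inconsistent_records(sam_lines):
    """Yield (line_number, cigar_length, seq_length) for mismatching records."""
    for line_number, line in enumerate(sam_lines, start=1):
        if line.startswith("@"):
            continue  # header line, no alignment fields
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        cigar, seq = fields[5], fields[9]
        if cigar == "*" or seq == "*":
            continue  # nothing to compare when either field is absent
        expected = cigar_query_length(cigar)
        if expected != len(seq):
            yield line_number, expected, len(seq)


if __name__ == "__main__":
    print(cigar_query_length("3S73M"))  # 76
```

To scan a file, pass an open handle, for example `find_inconsistent_records(open("aln.sam"))`, where `aln.sam` is a placeholder name. Records it reports could then be dropped or set aside before running `samtools view`, although that only works around the problem rather than explaining it.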

alignment hisat rnaseq • 572 views
17 months ago, Devon Ryan (94k), Freiburg, Germany, wrote:

This looks like a bug in HISAT2; please report it to the authors.


Thank you for the confirmation. I will update this post once I hear from them.

Reply written 17 months ago by DVA (530)