Question: samtools sort truncated
20 months ago, backmoons wrote:

Dear all,

I ran into a problem while analyzing RNA-seq data with a STAR-HTSeq-count pipeline. After STAR finished and produced a BAM file, I used "samtools sort" to sort it, but I got this error:

**[E::bam_read1] CIGAR and query sequence lengths differ for GWNJ-0842:379:GW1810081505:6:2106:29579:24884
samtools sort: truncated file. Aborting**

I have 6 samples. Five of them sorted fine with "samtools sort"; only this one fails. I tried different parameters but still get the error. Here is my script:

  1. genome generator

    STAR --runThreadN 16 --runMode genomeGenerate --genomeDir /home/xzm0017/Catfish/NS1809045_resequencing/clean_data/STAR3/star_index/ --genomeFastaFiles /home/xzm0017/Catfish/Channel_genome_transctipts_index/Channel_genome/0016606251ChannelCatfish_genome.fna --sjdbGTFfile /home/xzm0017/Catfish/Channel_genome_transctipts_index/Channel_genome/GCF_001660625.1_IpCoco_1.2_yulin_genomic.gtf --sjdbOverhang 149

  2. map

    STAR --runThreadN 16 --genomeDir /home/xzm0017/Catfish/NS1809045_resequencing/clean_data/STAR3/star_index/ --readFilesIn /home/xzm0017/Catfish/NS1809045_resequencing/clean_data/Chan7-1_R1_left_paired_trimmed.fq /home/xzm0017/Catfish/NS1809045_resequencing/clean_data/Chan7-1_R2_right_paired_trimmed.fq --outFileNamePrefix /home/xzm0017/Catfish/NS1809045_resequencing/clean_data/STAR2/chan71_2_mapped --limitOutSJcollapsed 5000000 --limitIObufferSize 300000000 --outSAMtype BAM Unsorted --limitBAMsortRAM 87162435271 --sjdbOverhang 149

  3. samtools sort

    samtools sort -n -T /home/xzm0017/Catfish/NS1809045_resequencing/clean_data/STAR2/tmp/ -o Catfish/NS1809045_resequencing/clean_data/STAR2/chan71_2_aftersort /home/xzm0017/Catfish/NS1809045_resequencing/clean_data/STAR2/chan71_2_mappedAligned.out.bam

This has bothered me for several days. If any of you could suggest a way to fix it, I would really appreciate it. Thanks in advance.

modified 20 months ago by finswimmer (13k) • written 20 months ago by backmoons (0)

If you want the file sorted, why not tell STAR to sort it?

written 20 months ago by swbarnes2 (8.7k)

Hi, thanks for your kind reply. You mean the "--outSAMtype BAM Unsorted" parameter in STAR? Yes, maybe I can change it (I will try that later). But I just tried running without this parameter, so the output was a SAM file, and then used "samtools view" to convert the SAM file to BAM: same error. What I am afraid of is that something went wrong during mapping, so the problem may still show up in the next step if I don't figure out what it is.

written 20 months ago by backmoons (0)
20 months ago, Vitis (New York) wrote:

Can you identify the offending read(s) with sequence length and CIGAR length? Also, please see this:

CIGAR and query sequence are of different length when trying to convert from sam to bam?

You can use Picard's ValidateSamFile tool to find the problematic reads. I think they may stem from a compatibility issue between your mapper and samtools.

modified 20 months ago • written 20 months ago by Vitis (2.4k)

Sorry, I am very new to this. What do you mean by offending read(s)... :(

written 20 months ago by backmoons (0)

The error message says that a read's sequence length is inconsistent with its CIGAR string. If a read is 101 bp long and all of it has been aligned to the genome, it will have the CIGAR string 101M, meaning 101 aligned bases (M covers both matches and mismatches). You can imagine that different reads have different CIGAR strings reflecting how each one mapped. The read length has to agree with the length implied by the CIGAR. In short, the error tells you that at least one read has a sequence length that disagrees with its CIGAR. You need to identify those reads, look at how the mapper (STAR, in your case) aligned them, and work out why it produced the inconsistency.
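
A minimal sketch of how one might list such reads, assuming STAR was run with SAM output so the records can be read as plain text. The helper name `cigar_mismatches` and the file name in the usage note are hypothetical. Per the SAM specification, only the M, I, S, = and X operations consume read bases, so their lengths should sum to the length of SEQ.

```shell
#!/bin/sh
# Sketch: list SAM records whose CIGAR-implied read length differs from
# the length of SEQ. Reads SAM text on stdin, prints offenders on stdout.
# cigar_mismatches is a hypothetical helper name, not part of any tool.
cigar_mismatches() {
  awk -F '\t' '
    /^@/ { next }                       # skip header lines
    $6 == "*" || $10 == "*" { next }    # no CIGAR or no SEQ: nothing to compare
    {
      cigar = $6
      qlen = 0
      # Walk the CIGAR one <length><op> pair at a time.
      while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
        n  = substr(cigar, 1, RLENGTH - 1) + 0
        op = substr(cigar, RLENGTH, 1)
        # Only M, I, S, = and X consume bases of the read.
        if (index("MIS=X", op) > 0) qlen += n
        cigar = substr(cigar, RLENGTH + 1)
      }
      if (qlen != length($10))
        printf "%s\tcigar=%s\tcigar_len=%d\tseq_len=%d\n", $1, $6, qlen, length($10)
    }
  '
}
```

For example, `cigar_mismatches < chan71_2_mappedAligned.out.sam` (file name assumed) would print one line per inconsistent read with the read name, the CIGAR, and both lengths. The read named in the error message should be among them, and looking those names up in the trimmed FASTQ files may show whether the records were already damaged before mapping.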

modified 20 months ago • written 20 months ago by Vitis (2.4k)