Question: converting sam to bam format samtools
gravatar for Hanan
3.5 years ago by
Tel Aviv University
Hanan70 wrote:

this is my first time runing bwa and samtools. i am not so similar with the terminology.
I have generated sam file with the command:
bwa sampe ref.fa aln_sa1.sai aln_sa2.sai read1.fq read2.fq > aln-pe.sam
the read files are actually files that were processed by the servicing lab and have only unique reads. they look like this:

>@SBS123:70:B0563ABXX:3:1101:1246:2041 1:N:0:GCCAAT


the sam file head looks like this:

@SQ     SN:UCW_Tu-k45_contig_21575;tu-k61_contig_17060  LN:1805
@SQ     SN:UCW_Tu-k25_contig_8865;tu-k31_contig_8795    LN:3898

when i run:
samtools view -bS aln.sam > aln.bam
I get this message:

[sam_header_line_parse] expected '@XY', got [@SBS123:70:B0563ABXX:3:1101:1206:2056    83    UCW_Tu-k41_contig_1687    3010    60    101M    =    2977    -134    CTGATGACCCATGTGTGATGAACAGCTACCTCTTTGTTCAAGAGCTAGTGATGGAGCTAGAAATGTCNACTTGTAGTGTAAGTTGGCAAAATGACTGCTTN    *    XT:A:U    NM:i:5    SM:i:25    AM:i:25    X0:i:1    X1:i:0    XM:i:5    XO:i:0    XG:i:0    MD:Z:1C8T56C2A29T0]
Hint: The header tags must be tab-separated.
[samopen] no @SQ lines in the header.
[sam_read1] missing header? Abort!


Any suggestions are welcomed
Thank you

sam samtools bam • 1.6k views
ADD COMMENTlink modified 3.5 years ago • written 3.5 years ago by Hanan70

can you please show us the output of 'head -n 20 aln-pe.sam'

ADD REPLYlink written 3.5 years ago by Pierre Lindenbaum109k

Thank you for the quick response. The next 20 lines will look the same as the first two. I forgot to say this is Rseq analysis against 88000 transcriptome contigs.

ADD REPLYlink written 3.5 years ago by Hanan70


The problem was solved by removing the @ signs from the reads in the aln.sam file. I did it with the command

sed -i 's/@SBS/SBS/g' aln.sam


Thank you for your help
ADD REPLYlink written 3.5 years ago by Hanan70
gravatar for Devon Ryan
3.5 years ago by
Devon Ryan81k
Freiburg, Germany
Devon Ryan81k wrote:

It looks like your aligner didn't remove the @ signs from in front of the read names! That'll cause the problem you're seeing. Can you show the first read in the fastq file?

ADD COMMENTlink written 3.5 years ago by Devon Ryan81k

Thank you for the quick response. I do not have the original fastq files. I only got the processed fasta files. I can try remove the @ with sed. Thank you

ADD REPLYlink written 3.5 years ago by Hanan70
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 733 users visited in the last hour