Question: Convert an altered sam file into a bam file
2.6 years ago, Jautis (United States) wrote:

Hi, I have a sam file from BSMap which I have modified into the format of a sam file from bismark.

However, after doing so, I'm no longer able to convert the sam file into a bam file using samtools. What am I missing? More generally: when can samtools convert sam to bam, and which sam formats are acceptable?

Thanks in advance!


#reorder columns
awk '{print $1 "\t" $13 "\t" $3 "\t" $4 "\t" $5 "\t" $6 "\t" $7 "\t" $8 "\t" $9 "\t" $10 "\t" $11 "\t" $12}' SAM > SAM2
#reattach header (not-modified)
cat sam_head SAM2 > temp; mv temp SAM2
#attempt to convert file
samtools view -bS SAM2 > BAM

Error Message: (line 25 is the first read after the header)

[E::sam_parse1] unrecognized type
[W::sam_read1] parse error at line 25
[main_samview] truncated file.

SAM file (SAM2), first 25 lines:

@HD     VN:1.0
@SQ     SN:chr4 LN:165299245
@SQ     SN:chrX LN:143131424
@SQ     SN:chr2 LN:187378091
@SQ     SN:chr6 LN:174439528
@SQ     SN:chr8 LN:139646187
@SQ     SN:chr12        LN:104110932
@SQ     SN:chr10        LN:90941950
@SQ     SN:chr14        LN:123829720
@SQ     SN:chr16        LN:74645514
@SQ     SN:chr18        LN:72186199
@SQ     SN:chr20        LN:71807805
@SQ     SN:chr1 LN:220367699
@SQ     SN:chr3 LN:180432695
@SQ     SN:chr5 LN:178775436
@SQ     SN:chr7 LN:162156779
@SQ     SN:chr9 LN:125196307
@SQ     SN:chr11        LN:132286798
@SQ     SN:chr13        LN:128036923
@SQ     SN:chr15        LN:107442819
@SQ     SN:chr17        LN:90913898
@SQ     SN:chr19        LN:51301725
@SQ     SN:mtDNA        LN:16566
@PG     ID:BSMAP        VN:2.90 CL:"bsmap -3 -n 1 -v 0.1 -r 0 -a ./tomap.fq.gz -d /file.sam"
Tags: sequencing, sam, bam
modified 2.5 years ago • written 2.6 years ago by Jautis

As for this problem in particular: we'd need to see line 25 itself to understand what's wrong with it.
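One way to produce that printout while also exposing invisible characters is sketched below. `show_line` is a hypothetical helper name (not part of samtools), and `SAM2` and `25` are simply the file and line number from the question; the sketch assumes only a POSIX shell and awk.

```shell
# show_line FILE N: print line N of FILE with invisible characters made visible.
# (show_line is a hypothetical helper name, not a samtools command.)
show_line() {
    awk -v n="$2" 'NR == n {
        gsub(/\t/, "<TAB>")   # the field separator SAM expects
        gsub(/ /,  "<SP>")    # stray spaces
        gsub(/\r/, "<CR>")    # Windows line endings
        print
        exit
    }' "$1"
}

# Example: inspect the line named in the samtools parse error
# show_line SAM2 25
```

Any `<SP>` or `<CR>` marker next to a field boundary would be worth a closer look, since SAM records are expected to be tab-delimited.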

More generally speaking:

A SAM file (to be called as such) requires a properly formatted header and a series of records with the columns defined in the file format definition (link). To convert a sam to a bam, your file needs both of these elements. It doesn't matter if the header lists more scaffolds than the ones represented in the records; what matters is that the opposite doesn't happen, i.e. records must not point at chromosomes / scaffolds that are missing from the header. You'll find all of this in chapter 1.3 of the linked PDF.

In your case I see you're reattaching the header, so: are you keeping the whole header when you generate it? Do all of your record lines contain the same number of fields?
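The checks asked about above can be approximated with awk before handing the file to samtools. This is a rough sketch rather than a full validator: `check_sam` is a made-up helper name, it assumes tab-separated records, and it only tests that each record has at least the 11 mandatory SAM columns and that its reference name (column 3) is declared on an `@SQ` header line.

```shell
# check_sam FILE: report records with fewer than 11 tab-separated fields,
# or whose reference name (column 3) is not declared in an @SQ header line.
# (check_sam is a hypothetical helper name.)
check_sam() {
    awk -F'\t' '
        /^@/ {
            # Collect the SN: value of every @SQ header line
            if ($1 == "@SQ")
                for (i = 2; i <= NF; i++)
                    if ($i ~ /^SN:/) known[substr($i, 4)] = 1
            next
        }
        NF < 11 { print "line " NR ": only " NF " fields"; next }
        $3 != "*" && !($3 in known) { print "line " NR ": unknown reference " $3 }
    ' "$1"
}

# Example:
# check_sam SAM2
```

No output means both checks passed; it does not guarantee that samtools will accept the file, since the individual field contents are not inspected here.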

modified 2.5 years ago • written 2.5 years ago by Macspider

Thanks! I went ahead and added the first 25 lines to the initial question. Yes, I am keeping the same header that I initially generated. I am adding additional fields (NM, MD, XM, XR, and XG tags with dummy values).

modified 2.5 years ago • written 2.5 years ago by Jautis

Does it print the same error if you exclude the XM tag field at the end? And if you change the read name? Maybe there are some metacharacters... Also, check for stray whitespace!
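For the tag fields in particular, the SAM specification requires each optional field to have the form TAG:TYPE:VALUE, with TYPE being one of A, i, f, Z, H or B. As far as I can tell, the "unrecognized type" message comes from the part of the parser that reads these fields, so scanning columns 12 onward may be worthwhile. The sketch below is only a rough filter under that assumption: `check_tags` is a hypothetical helper name, and it checks the TAG:TYPE: prefix of each field but not the value.

```shell
# check_tags FILE: flag optional fields (column 12 onward) that do not start
# with TAG:TYPE: where TYPE is one of A, i, f, Z, H, B.
# (check_tags is a hypothetical helper name.)
check_tags() {
    awk -F'\t' '
        /^@/ { next }   # skip header lines
        {
            for (i = 12; i <= NF; i++)
                if ($i !~ /^[A-Za-z][A-Za-z0-9]:[AifZHB]:/)
                    print "line " NR ", field " i ": " $i
        }
    ' "$1"
}

# Example:
# check_tags SAM2
```

A dummy value written as, say, `XM:z:....` (lowercase type) or `XG:CT` (no type) would be reported by this check; whether that is what happened in the file from the question cannot be told from the thread.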

written 2.5 years ago by Macspider

Please post the first few lines and the last few lines of your SAM file so we can get a clear idea of what has gone wrong.

written 2.6 years ago by toralmanvar