Question: Convert an altered sam file into a bam file
Asked 14 months ago by Jautis (United States):

Hi, I have a sam file from BSMap which I have modified into the format of a sam file from bismark.

However, after doing so, I'm no longer able to convert the sam file into a bam file using samtools. What am I missing? The more general version of this question: when can samtools convert sam to bam, and which sam formats are acceptable?

Thanks in advance!

Code

# Reorder columns: column 13 takes the place of column 2; columns 3-12 keep their order
awk 'BEGIN { OFS = "\t" } { print $1, $13, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12 }' SAM > SAM2
# Reattach the (unmodified) header
cat sam_head SAM2 > temp; mv temp SAM2
# Attempt to convert the file to BAM
samtools view -bS SAM2 > BAM

Error Message: (line 25 is the first read after the header)

[E::sam_parse1] unrecognized type
[W::sam_read1] parse error at line 25
[main_samview] truncated file.

SAM file, first 25 lines:

@HD     VN:1.0
@SQ     SN:chr4 LN:165299245
@SQ     SN:chrX LN:143131424
@SQ     SN:chr2 LN:187378091
@SQ     SN:chr6 LN:174439528
@SQ     SN:chr8 LN:139646187
@SQ     SN:chr12        LN:104110932
@SQ     SN:chr10        LN:90941950
@SQ     SN:chr14        LN:123829720
@SQ     SN:chr16        LN:74645514
@SQ     SN:chr18        LN:72186199
@SQ     SN:chr20        LN:71807805
@SQ     SN:chr1 LN:220367699
@SQ     SN:chr3 LN:180432695
@SQ     SN:chr5 LN:178775436
@SQ     SN:chr7 LN:162156779
@SQ     SN:chr9 LN:125196307
@SQ     SN:chr11        LN:132286798
@SQ     SN:chr13        LN:128036923
@SQ     SN:chr15        LN:107442819
@SQ     SN:chr17        LN:90913898
@SQ     SN:chr19        LN:51301725
@SQ     SN:mtDNA        LN:16566
@PG     ID:BSMAP        VN:2.90 CL:"bsmap -3 -n 1 -v 0.1 -r 0 -a ./tomap.fq.gz -d /file.sam"
1_7163:15-114   16      chr18   70657804        255     100M    *       0       0       ATAAATTATTATATTAATGTAAAAGTAGTAAATATTTTTGTGGTGTAGTTTGCGTGTTTGGTTTTTTTTATTATTTATTTGTGAGACGTTGATTTTCGTT    IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII    NM:i:0  MD:Z:0G1G1G4A9G13G3G0G1A2G2G2G39G2      XM:Z:h.h.h..............x....H........x...xh....x.Zx..x........Z...............................x..      XR:Z:CT  XG:Z:GA

Speaking of this problem in particular: we'd need a printout of line 25 to understand what's wrong with it.

More generally speaking:

A SAM file (to be called as such) requires a properly formatted header and a series of records with the columns defined in the file format definition (link). To convert a SAM file to BAM, your file needs both of these elements. It doesn't matter if the header contains more scaffolds than the ones represented in the records; what matters is that the opposite doesn't happen, i.e. that no record points at a chromosome or scaffold that is missing from the header. You'll find everything in chapter 1.3 of the linked PDF.

In your case I see you're reattaching the header, so: are you keeping the entire header when you generate it? Do all of your record lines contain the same number of fields?
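As a rough sketch of those two checks (not a substitute for samtools' own parser), an awk pass along the following lines could report records with fewer than the 11 mandatory tab-separated fields, and records whose RNAME is not declared in an @SQ line. The file name `example.sam`, its toy contents, and the helper name `check_sam` are made up for illustration.

```shell
# Build a tiny, made-up SAM file for illustration (hypothetical contents).
printf '@HD\tVN:1.0\n@SQ\tSN:chr18\tLN:72186199\n' > example.sam
printf 'r1\t16\tchr18\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0\n' >> example.sam
printf 'r2\t0\tchr99\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\n' >> example.sam
printf 'r3\t0\tchr18\t100\t255\t4M\t*\t0\t0\tACGT\n' >> example.sam

# Hypothetical helper: report records with fewer than 11 tab-separated fields,
# or whose RNAME (column 3) is neither "*" nor declared in an @SQ header line.
check_sam() {
    awk -F'\t' '
        # Collect reference names from the SN: field of each @SQ line.
        $1 == "@SQ" { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) seen[substr($i, 4)] = 1; next }
        # Skip the remaining header lines (@HD, @PG, ...).
        $1 ~ /^@[A-Z][A-Z]$/ { next }
        NF < 11 { print "line " NR ": only " NF " fields" }
        NF >= 3 && $3 != "*" && !($3 in seen) { print "line " NR ": RNAME " $3 " not in header" }
    ' "$1"
}

check_sam example.sam
```

On the toy file this should flag line 4 (unknown reference `chr99`) and line 5 (only 10 fields). Pointing `check_sam` at the real file would be the obvious next step, keeping in mind that it only covers the two conditions discussed here.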

written 14 months ago by Macspider

Thanks! I went ahead and added the first 25 lines to the initial question. Yes, I am keeping the same header that I initially generated. I am adding additional fields (NM, MD, XM, XR, and XG tags with dummy values).

written 14 months ago by Jautis

Does it print the same error if you exclude the XM tag field at the end? And if you change the read name? Maybe there are some meta-characters... Also, check whether you have stray whitespace (spaces where tabs should be)!
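One way to look for this kind of problem, sketched under assumptions: the SAM specification describes optional fields as TAG:TYPE:VALUE, with TYPE one of A, i, f, Z, H, or B, so an awk pass could flag any field from column 12 onward that does not fit that shape, plus any field containing a space. The space check is only a heuristic, since a Z-type value may legally contain spaces. The file name `tags.sam`, its contents, and the helper name `check_tags` are invented for this example.

```shell
# Made-up two-record example (hypothetical contents): the second record has a
# tag with an invalid type and a space instead of a tab before its last tag.
printf 'r1\t16\tchr18\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0\tXR:Z:CT\tXG:Z:GA\n' > tags.sam
printf 'r2\t16\tchr18\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\tNM:x:0\tXR:Z:CT XG:Z:GA\n' >> tags.sam

# Hypothetical helper: flag optional fields (column 12 onward) that do not
# start with TAG:TYPE: (TYPE in A, i, f, Z, H, B), and any field with a space.
check_tags() {
    awk -F'\t' '
        # Skip header lines (@HD, @SQ, @PG, ...).
        $1 ~ /^@[A-Z][A-Z]$/ { next }
        {
            for (i = 1; i <= NF; i++) {
                if (i >= 12 && $i !~ /^[A-Za-z][A-Za-z0-9]:[AifZHB]:/)
                    print "line " NR ", field " i ": malformed tag: " $i
                if (index($i, " ") > 0)
                    print "line " NR ", field " i ": contains a space: " $i
            }
        }
    ' "$1"
}

check_tags tags.sam
```

On the toy file this should report field 12 of line 2 (`NM:x:0`) as malformed and field 13 of line 2 as containing a space. For a quick look at a single record, `sed -n '25l' SAM2` prints line 25 with tabs written as `\t`, which makes a space-for-tab mix-up easier to spot.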

written 14 months ago by Macspider

Please post the first few and the last few lines of your SAM file so that we can get a clear idea of what has gone wrong.

written 14 months ago by toralmanvar