Reformat sam header issue
2.3 years ago
pg_canada • 0

Hi everyone,

I'm new to this community and new to this type of analysis, so I apologize if this question seems simple.

I'm formatting some BAM files in order to run them through EXCAVATOR for CNV analysis. For a couple of the BAM files, I get the following error:

[E::sam_parse1] missing SAM header
[W::sam_read1] parse error at line 7
[main_samview] truncated file.

When I check the BAM (samtools view -h mybam.bam), the file does seem to have a header, as shown below:

@HD     VN:1.4  GO:none SO:coordinate
@SQ     SN:chr1 LN:249250621    M5:1b22b98cdeb4a9304cb5d48026a85128     UR:/mnt/
@SQ     SN:chr2 LN:243199373    M5:a0d9851da00400dec1098a9255ac712e     UR:/mnt/
@SQ     SN:chr3 LN:198022430    M5:641e4338fa8d52a5b781bd2a2c08d3c3     UR:/mnt/
@SQ     SN:chr4 LN:191154276    M5:23dccd106897542ad87d2765d28a19a1     UR:/mnt/
@SQ     SN:chr5 LN:180915260    M5:0740173db9ffd264d728f32784845cd7     UR:/mnt/
@SQ     SN:chr6 LN:171115067    M5:1d3a93a248d92a729ee764823acbbc6b     UR:/mnt/
@SQ     SN:chr7 LN:159138663    M5:618366e953d6aaad97dbe4777c29375e     UR:/mnt/
@SQ     SN:chrX LN:155270560    M5:7e0e2e580297b7764e31dbc80c2540dd     UR:/mnt/
@SQ     SN:chr8 LN:146364022    M5:96f514a9929e410c6651697bded59aec     UR:/mnt/
@SQ     SN:chr9 LN:141213431    M5:3e273117f15e0a400f01055d9f393768     UR:/mnt/
@SQ     SN:chr10        LN:135534747    M5:988c28e000e84c26d552359af1ea2e1d     
@SQ     SN:chr11        LN:135006516    M5:98c59049a2df285c76ffb1c6db8f8b96     
@SQ     SN:chr12        LN:133851895    M5:51851ac0e1a115847ad36449b0015864     
@SQ     SN:chr13        LN:115169878    M5:283f8d7892baa81b510a015719ca7b0b     
@SQ     SN:chr14        LN:107349540    M5:98f3cae32b2a2e9524bc19813927542e

etc ..

Has anyone encountered this before? Any pointers as to how I can fix this?

This is the command I'm using to reformat:

samtools view -h mybam.bam | awk 'BEGIN{FS=OFS="\t"} (/^@/ && !/@SQ/){print $0} $2~/^SN:[1-9]|^SN:X|^SN:Y|^SN:MT/{print $0}  $3~/^[1-9]|X|Y|MT/{$3="chr"$3; print $0} ' | sed 's/SN:/SN:chr/g' | sed 's/chrMT/chrM/g' | samtools view -bS - > mybam_merge_reformat.bam

Thank you


What is the output of your pipeline BEFORE the last samtools view?

PS: this awk might not work. You're going to add "chr" to some of the unmapped reads, and you're ignoring the mate's reference name (RNEXT) and the 'SA' tag of supplementary alignments.
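
One way to look at the text stream before the last samtools view, without touching the real files, is to run the awk/sed stage from the question on a few hand-made SAM lines. The sketch below is only an illustration: the helper names (reformat_sam, prefixed_sam, unprefixed_sam) and the toy records r1, r2 and r3 are made up for this example, and it is not a confirmed diagnosis of the error above. It only needs awk and sed.

```shell
#!/bin/sh
# The awk/sed stage copied from the question, wrapped as a stdin->stdout
# filter so its text output can be read before it reaches `samtools view -bS -`.
reformat_sam() {
    awk 'BEGIN{FS=OFS="\t"}
         (/^@/ && !/@SQ/){print $0}
         $2~/^SN:[1-9]|^SN:X|^SN:Y|^SN:MT/{print $0}
         $3~/^[1-9]|X|Y|MT/{$3="chr"$3; print $0}' |
    sed 's/SN:/SN:chr/g' | sed 's/chrMT/chrM/g'
}

# Hypothetical toy SAM whose reference names ALREADY start with "chr",
# like the header shown in the question.
prefixed_sam() {
    printf '@HD\tVN:1.4\tSO:coordinate\n'
    printf '@SQ\tSN:chr1\tLN:249250621\n'
    printf '@SQ\tSN:chrX\tLN:155270560\n'
    printf 'r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n'
    printf 'r2\t0\tchrX\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n'
}

# Hypothetical toy SAM WITHOUT the "chr" prefix; read r3 has its mate and a
# supplementary alignment (SA tag) on reference "2".
unprefixed_sam() {
    printf '@HD\tVN:1.4\tSO:coordinate\n'
    printf '@SQ\tSN:1\tLN:249250621\n'
    printf '@SQ\tSN:2\tLN:243199373\n'
    printf 'r3\t97\t1\t100\t60\t4M\t2\t500\t0\tACGT\tIIII\tSA:Z:2,900,+,4M,60,0;\n'
}

echo "--- chr-prefixed input ---"
prefixed_sam | reformat_sam

echo "--- unprefixed input ---"
unprefixed_sam | reformat_sam
```

On the chr-prefixed toy input, no @SQ line passes the filter, because SN:chr1 does not match ^SN:[1-9]. Read r1 is dropped, and r2 comes out with the reference name chrchrX, which no header line declares. On the unprefixed input, the header and column 3 are renamed, but column 7 and the SA tag of r3 still say 2. For a real file, the same check would be the original pipeline with the last samtools view replaced by something like head or less.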

