Changing chromosome format in bam
4.8 years ago
srhic ▴ 60

Hello,

I know there are already multiple threads on this issue and I have gone through most of them but I am new to bioinformatics and need some extra help.

I have a paired-end BAM file in which the chromosome names are in the format X, Y, 1, 2, 3 ... and I need to convert them into the format chr1, chr2, etc. Based on some previous answers I tried the following code, but the BAM file it produces gives me multiple errors in downstream analysis, so there is probably some mistake in it:

samtools view -h file.bam |\
sed -e '/^@SQ/s/SN\:/SN\:chr/' -e '/^[^@]/s/\t/\tchr/2'|\
awk -F ' ' '$7=($7=="=" || $7=="*"?$7:sprintf("chr%s",$7))' |\
tr " " "\t"

I have also read that Picard or samtools can be used to change the header, but I am not sure which commands to use for that.

If anyone can guide me on how to do this with some more explanation of what the code is actually doing, I would really appreciate it.

Thanks

Edit: I also tried the code given here: https://josephcckuo.wordpress.com/2016/11/17/modify-chromosome-notation-in-bam-file/ but it keeps giving me the error: [W::sam_parse1] urecognized mate reference name; treated as unmapped

bam samtools picard • 1.8k views

Thanks, got it. Will try that.


I have edited the header file to include the "chr" prefix. When I use the reheader command to apply it to the BAM file, I get this error:

Malformed key:value pair at line 1: "@HD VN:1.5 SO:coordinate

This is the first line of the header which I didn't change at all. Any ideas what might be going on?

Thanks


Did you edit the file on an OS other than Unix and then move it back to Unix? If so, you may need to fix the line endings by running dos2unix header.sam.
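A small sketch for checking an edited header before passing it to samtools reheader. It covers two things a non-Unix editor can do to the file: add carriage returns, and (this part is an assumption, not something confirmed in this thread) replace the tabs between fields with spaces, which could also produce a "Malformed key:value pair" message. The function names strip_cr and find_untabbed are made up for illustration.

```shell
#!/bin/sh
# Remove Windows carriage returns from header text (stdin to stdout).
strip_cr() {
    tr -d '\r'
}

# List header lines that do not have a tab right after the two-letter
# record type (@HD, @SQ, ...). SAM header fields must be tab-separated,
# so a line such as "@HD VN:1.5 SO:coordinate" written with spaces is reported.
find_untabbed() {
    awk '/^@/ && !/^@[A-Z][A-Z]\t/ { print NR ": " $0 }'
}

# Example usage (file names are placeholders):
#   strip_cr < header.sam > header.fixed.sam
#   find_untabbed < header.fixed.sam
```

If find_untabbed prints nothing, the field separators look fine and the problem is more likely elsewhere.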


Yes, I had edited it in Windows, but the error is the same after running dos2unix. I will try editing it in Unix.


I wasn't able to reply earlier because of posting limits for new users, but editing the header manually in Unix worked perfectly. Thanks!

4.8 years ago
GenoMax 141k

Instead of doing this, use samtools reheader with a modified copy of the header. See: C: How to change a BAM file so the chromosome identifier is "chr 1" not just "1"


Thanks, I had seen this thread before. This may be a very basic question, but I don't understand how to use the samtools reheader command. I will supply the command with the input BAM file, but what is the input SAM file whose header it uses as the replacement?


First grab the header from the file you have.

samtools view -H yourfile.bam > header.sam

Edit the header.sam file so that the sequence names are the way you want them to be. Then use the edited header.sam file with the reheader command in the thread above.
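The steps above can be sketched as a script so the header does not have to be edited by hand. This works because a BAM record refers to its chromosome by an index into the header rather than by name, so renaming the @SQ entries renames the chromosome for every read and mate. The helper name add_chr_prefix and the output file names are placeholders, and turning MT into chrM (the UCSC-style name) is an assumption about what the downstream tools expect.

```shell
#!/bin/sh
# Add a "chr" prefix to the SN: field of every @SQ header line.
# Reads SAM header text on stdin and writes the edited header to stdout.
add_chr_prefix() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^@SQ/ {
             for (i = 2; i <= NF; i++) {
                 # Skip names that already carry the prefix.
                 if ($i ~ /^SN:/ && $i !~ /^SN:chr/) {
                     name = substr($i, 4)
                     if (name == "MT") name = "M"   # assumption: chrM is wanted
                     $i = "SN:chr" name
                 }
             }
         }
         { print }'
}

# Example usage (needs samtools; file names are placeholders):
#   samtools view -H yourfile.bam | add_chr_prefix > header.sam
#   samtools reheader header.sam yourfile.bam > yourfile.chr.bam
#   samtools index yourfile.chr.bam
```

Lines other than @SQ pass through untouched, so @HD and @PG entries keep their original content.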


Hi,

I am trying to add the "chr" prefix to the chromosome numbers in a bam file. The advice in this post seems very logical (1. samtools view -H yourfile.bam > header.sam; 2. edit the header.sam file the way you want it to be; 3. samtools reheader header.sam yourfile.bam). However, the third step doesn't work. I get some lines with binary symbols and then the following error message: [E::bgzf_flush] File write failed (wrong size) samtools reheader: Couldn't write header: Input/output error

How can I solve this? I came across a related post when doing a search with the keywords "samtool reheader File write failed" (see https://github.com/ChangLab/FAST-iCLIP/issues/32), but the problem was not solved in that post.

Could someone advise me on how to prevent the error "[E::bgzf_flush] File write failed (wrong size)" when running the command "samtools reheader header.sam filename.bam"?
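One thing worth checking, as a sketch rather than a confirmed diagnosis: samtools reheader writes the new BAM to standard output, so the command as quoted sends compressed binary data to the terminal, which would explain the "lines with binary symbols". Whether that is also what triggers the bgzf_flush message is an assumption. Redirecting the output to a file looks like this; the wrapper name reheader_to_file and the SAMTOOLS variable are hypothetical conveniences.

```shell
#!/bin/sh
# Run "samtools reheader" and send the resulting BAM to a file instead of
# the terminal. SAMTOOLS may point at a specific samtools binary
# (hypothetical override); by default the one on PATH is used.
reheader_to_file() {
    header=$1
    in_bam=$2
    out_bam=$3
    "${SAMTOOLS:-samtools}" reheader "$header" "$in_bam" > "$out_bam"
}

# Example usage (file names are placeholders):
#   reheader_to_file header.sam filename.bam filename.chr.bam
```

If the error persists with the redirect in place, free disk space and write permissions in the output directory would be the next things to look at.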

Or - if this problem cannot be solved - is there another way to add the "chr" prefix to the chromosome indices in a bam file?
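As a possible alternative, here is a sketch that rewrites the SAM text stream instead of only the header. It keeps the tab as the field separator throughout, so header lines containing spaces (for example @PG command lines) are not split, and it leaves "*" and "=" alone in the reference (column 3) and mate reference (column 7) fields. The function name add_chr_to_sam is made up; MT is not converted to chrM here, and chromosome names stored inside optional tags such as SA are not touched.

```shell
#!/bin/sh
# Add "chr" to @SQ names, RNAME (column 3) and RNEXT (column 7) of SAM text.
# Reads SAM (header plus records) on stdin and writes SAM on stdout.
add_chr_to_sam() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^@SQ/ {
             for (i = 2; i <= NF; i++)
                 if ($i ~ /^SN:/) $i = "SN:chr" substr($i, 4)
             print
             next
         }
         /^@/ { print; next }          # other header lines pass through as-is
         {
             if ($3 != "*") $3 = "chr" $3
             if ($7 != "*" && $7 != "=") $7 = "chr" $7
             print
         }'
}

# Example usage (needs samtools; file names are placeholders):
#   samtools view -h filename.bam | add_chr_to_sam | samtools view -b -o filename.chr.bam -
```

This decompresses and recompresses the whole file, so it is slower than reheader on large BAMs.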

Thanks
