samtools merge iterate through multiple chromosomes and generates pairs
0
0
Entering edit mode
5.9 years ago
s1661753 • 0

I have .bam files from a human whole genome sequence that I've had to separate into individual chromosomes (1-22, X, Y) to make another program run. I am now wanting to merge chromosomes so that I have every paired possible combination e.g. chr1_chr2, chr1_chr3... chr1_chrY and finally chrX_chrY. At the moment, each individual chromosome and its index are in the same directory.

to make 1 pair, I know that the command is as follows (taken from samtools manual):

samtools merge output.bam input1.bam input2.bam

e.g.

samtools merge chr1_chr2.bam chr1.bam chr2.bam

But how can I iterate through every possible combination of pairs, rather than writing each line of code (~ 300 lines of samtools merge ...) I don't need pairs to repeat, for example chr1 can pair with chr2:22,X,Y but chr2 doesn't need to then merge again with chr1 (or can this not be avoided/ delete later?)

I've got as far as

for file in *_chr*.bam; do samtools merge merge_${file}.bam $file $file; done

I understand that this will find all of my chromosome .bam files, then merge but this isn't going to iterate through every combination is it?

samtools unix whole genome sequencing genome • 2.3k views
ADD COMMENT
0
Entering edit mode

use two loops instead of one....

ADD REPLY
0
Entering edit mode

thanks! Very new to unix and other computational languages... any tips on how to go about that... ? ...?

ADD REPLY
0
Entering edit mode

Why do you need pairs?

Do you realise you can use

samtools merge output.bam *.bam

And get all files merged?

ADD REPLY
0
Entering edit mode

Because a package I'm using downstream is eventually going to iterate through each pair. Alternatively I could keep the data as one large .bam file but its too large andexits before completing. This later package has supposedly been written specifically for whole genome sequencing analysis but doesn't write anything to file, rather it does everything in memory, reaches capacity then exits.

ADD REPLY

Login before adding your answer.

Traffic: 2722 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6