Bash loop for files in several directories
7.9 years ago
lmobuchon ▴ 40

Hi everyone, I am new to bash and am using it to run variant calling on 50 BAM files. Each of the 50 BAM files is in its own folder, for example: ./file1/file1.bam ./file2/file2.bam ... I would like to write a loop that runs the variant calling on all my samples, but I have no idea how to do it. I have a file listing the name of each sample, for example: file1 file2 file3 ... Thanks a lot for your help. Best,

Lenha

software error • 8.0k views

Thank you so much for your help. I'll try it! Thanks!


prog **/*.bam works in zsh. ☺


Learn something new every day. prog ./*/*.bam should work in Bash.
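
For what it's worth, Bash 4 and later can also expand ** recursively once the globstar option is switched on. A minimal sketch, assuming Bash 4+; list_bams is a made-up helper name used only for illustration:

```shell
#!/usr/bin/env bash
# Sketch only: recursive globbing in Bash 4+ (the ** pattern needs globstar).
# list_bams is a hypothetical helper, not part of any tool.
list_bams() {
    local root=$1
    (
        # globstar: ** matches any depth of directories.
        # nullglob: a pattern with no match expands to nothing.
        shopt -s globstar nullglob
        cd "$root" || exit 1
        for f in **/*.bam; do
            printf '%s\n' "$f"
        done
    )
}
```

Calling list_bams . prints every .bam file below the current directory, whatever its depth. The options are set inside a subshell, so they do not leak into the rest of the session.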

7.9 years ago
5heikki 11k
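# Read NUL-separated paths so file names containing spaces survive intact
find . -maxdepth 2 -type f -name "*.bam" -print0 |
while IFS= read -r -d '' FILE; do
    program -in "$FILE" -out "$FILE.out"
done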

Or you could just pipe the find output to xargs. Alternatively, you could define a function and feed the find output to it through GNU parallel.
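
The xargs route could look roughly like the sketch below. run_on_bams is a hypothetical helper name, and the command you pass to it is assumed to accept the BAM path as its last argument:

```shell
#!/usr/bin/env bash
# Sketch only. run_on_bams is a hypothetical helper:
#   run_on_bams ROOT CMD [ARGS...]
# It runs "CMD ARGS... FILE" once for every .bam file found
# at most two levels below ROOT.
run_on_bams() {
    local root=$1
    shift
    # -print0 / -0 keep file names with spaces or newlines intact.
    # -n 1 passes one file per invocation; adding e.g. -P 4 would
    # run up to four invocations at the same time.
    find "$root" -maxdepth 2 -type f -name '*.bam' -print0 |
        xargs -0 -n 1 "$@"
}
```

Running run_on_bams . echo first is a cheap way to preview which files would be processed before swapping in the real command.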

7.9 years ago
Chun-Jie Liu ▴ 280

First, make sure that each BAM file has a corresponding BAI index file in the same directory.

# Directory that contains your BAM folders
inDir='/directory_bam_folder'

# Find all BAM files under the directory and store their paths in an array (needs Bash 4+).
mapfile -t allBams < <(find "$inDir" -type f -name "*.bam")

# Then loop over the array and call variants on each file
for bam in "${allBams[@]}"
do
    callVariant "$bam"
done

Here is a simple example for handling bam files in parallel.
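
One way to run the calls concurrently in plain Bash, as a sketch only: start each call in the background and pause whenever a batch is full. call_in_batches is a hypothetical helper name, and the batch size is something you would tune to your machine:

```shell
#!/usr/bin/env bash
# Sketch only. call_in_batches is a hypothetical helper:
#   call_in_batches JOBS CMD FILE...
# It runs "CMD FILE" for every FILE, at most JOBS at a time.
call_in_batches() {
    local jobs=$1 cmd=$2
    shift 2
    local running=0 bam
    for bam in "$@"; do
        "$cmd" "$bam" &            # start one call in the background
        running=$((running + 1))
        if [ "$running" -ge "$jobs" ]; then
            wait                   # let the current batch finish
            running=0
        fi
    done
    wait                           # wait for the last, possibly partial, batch
}
```

With the array from above this would be called as call_in_batches 4 callVariant "${allBams[@]}". Each batch waits for its slowest job, so GNU Parallel or xargs -P will usually keep the cores busier.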

7.9 years ago
ole.tange ★ 4.5k

Using GNU Parallel:

parallel variant_calling_cmd {} ::: */*.bam

Learn more:

Gnu Parallel - Parallelize Serial Command Line Programs Without Changing Them

7.9 years ago
st.ph.n ★ 2.7k

Is each file prefix also the name of its folder? Which variant calling command are you using? You could place all the BAM files into one folder and run something like this:

for file in *.bam; do variant_calling_cmd "$file"; done

Or take your file listing the prefixes (input_vars.txt) and create a bash script that enters the directory for a given prefix and runs the command there:

#!/usr/bin/env bash
# $1 is the sample prefix, which is also the folder name
cd "/path/to/$1/" || exit 1
variant_calling_cmd "$1.bam"

Save it as variant_call_all.sh and pass each prefix to the script using xargs:

xargs -n 1 bash variant_call_all.sh < input_vars.txt

If there are flags or parameters to the variant calling program you're using, you will need to add them.
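
If you would rather keep everything in one script, a plain while-read loop over the prefix file does the same job. A minimal sketch, assuming one prefix per line and the ./prefix/prefix.bam layout from the question; call_from_list is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch only. call_from_list is a hypothetical helper:
#   call_from_list LIST BASE CMD [ARGS...]
# LIST holds one prefix per line; each BAM file is expected at
# BASE/prefix/prefix.bam and is passed to CMD as its last argument.
call_from_list() {
    local list=$1 base=$2
    shift 2
    local prefix bam
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r prefix || [ -n "$prefix" ]; do
        [ -n "$prefix" ] || continue          # skip empty lines
        bam="$base/$prefix/$prefix.bam"
        if [ -f "$bam" ]; then
            "$@" "$bam" < /dev/null           # keep CMD from reading the list on stdin
        else
            echo "missing: $bam" >&2
        fi
    done < "$list"
}
```

For example: call_from_list input_vars.txt /path/to variant_calling_cmd. Prefixes without a matching BAM file are reported on stderr rather than stopping the loop.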
