Question: Bash loop for files in several directories
lmobuchon wrote:

Hi everyone,

I am new to bash and am using it to run variant calling on 50 BAM files. Each of the 50 BAM files is in its own folder, for example: ./file1/file1.bam ./file2/file2.bam ... I would like to write a loop that runs the variant calling on all my samples, but I have no idea how to do it. I also have a file listing the name of each sample, for example: file1 file2 file3 ...

Thanks a lot for your help.

Best,

Lenha

software error • 2.4k views
modified 2.3 years ago by ole.tange • written 2.3 years ago by lmobuchon

Thank you so much for your help. I'll try it. Thanks!

written 2.3 years ago by lmobuchon

prog **/*.bam works in zsh. ☺

written 2.3 years ago by kloetzl

Learn something new every day. prog ./*/*.bam should work in Bash.

written 2.3 years ago by 5heikki
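
A minimal sketch of the glob approach from the comments above, assuming the one-folder-per-sample layout described in the question. The scratch directory, the sample names, and the `echo` standing in for the real caller are placeholders for illustration only.

```shell
# Recreate the layout from the question in a scratch directory (illustrative only).
workdir=$(mktemp -d)
for sample in file1 file2 file3; do
    mkdir -p "$workdir/$sample"
    : > "$workdir/$sample/$sample.bam"
done

# <dir>/*/*.bam matches every .bam file exactly one directory level down.
# echo is a stand-in for the real variant calling command.
count=0
for bam in "$workdir"/*/*.bam; do
    [ -e "$bam" ] || continue   # skip the literal pattern if nothing matched
    echo "would call variants on $bam"
    count=$((count + 1))
done
echo "files matched: $count"
```

Run from the parent folder of the samples, the same loop can use ./*/*.bam directly. Quoting "$bam" keeps paths with spaces intact.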
5heikki wrote:
# Note: splitting the output of $(find ...) assumes the paths contain no whitespace.
for FILE in $(find . -maxdepth 2 -type f -name "*.bam"); do
    program -in "$FILE" -out "$FILE.out"
done

Or you could just pipe the find to xargs. Or you could define a function and pipe the find to that through GNU parallel.
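
The xargs variant mentioned above might look roughly like this. It is a sketch under assumptions: the scratch directory and sample names are invented for the demonstration, and the `sh -c` body only writes a marker file where the real call (something like program -in "$1" -out "$1.out") would go.

```shell
# Recreate the layout from the question in a scratch directory (illustrative only).
workdir=$(mktemp -d)
for sample in file1 file2 file3; do
    mkdir -p "$workdir/$sample"
    : > "$workdir/$sample/$sample.bam"
done

# -print0 and -0 pass paths NUL-delimited, so spaces in names are safe.
# -n 1 runs the command once per file; "_" fills $0 so the path arrives as $1.
# The sh -c body is a placeholder for the real variant calling command.
find "$workdir" -maxdepth 2 -type f -name "*.bam" -print0 |
    xargs -0 -n 1 sh -c 'echo "processed $1" > "$1.out"' _

# Count the output files that were written next to each BAM file.
n_out=$(find "$workdir" -type f -name "*.bam.out" | wc -l | tr -d ' ')
echo "output files written: $n_out"
```

Unlike the for loop over $(find ...), this form does not split paths on whitespace.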

modified 2.3 years ago • written 2.3 years ago by 5heikki
Chun-Jie Liu wrote:

First, make sure that each BAM file has its corresponding BAI index file in the same directory.

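# Directory that contains your BAM folders
inDir='/directory_bam_folder'

# Find all BAM files under the directory and store their paths in an array.
# The paths are absolute because inDir is absolute. mapfile needs bash 4 or later.
mapfile -t allBams < <(find "$inDir" -type f -name "*.bam")

# Then loop over the array and call variants on each file
for bam in "${allBams[@]}"
do
    callVariant "$bam"
done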
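
The index check mentioned at the top of this answer could be sketched as follows. It assumes the index is named either sample.bam.bai or sample.bai, two common conventions; the scratch directory and file names are made up for the demonstration.

```shell
# Scratch layout: file1 has an index, file2 deliberately does not (illustrative only).
workdir=$(mktemp -d)
mkdir -p "$workdir/file1" "$workdir/file2"
: > "$workdir/file1/file1.bam"
: > "$workdir/file1/file1.bam.bai"
: > "$workdir/file2/file2.bam"

missing=0
for bam in "$workdir"/*/*.bam; do
    # Accept either naming convention: sample.bam.bai or sample.bai
    if [ -e "$bam.bai" ] || [ -e "${bam%.bam}.bai" ]; then
        echo "index found:   $bam"
    else
        echo "index missing: $bam" >&2
        missing=$((missing + 1))
    fi
done
echo "BAM files without an index: $missing"
```

Running a check like this before the calling loop avoids discovering a missing index halfway through 50 samples.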

Here is a simple example for handling bam files in parallel.

written 2.3 years ago by Chun-Jie Liu
ole.tange wrote:

Using GNU Parallel:

parallel variant_calling_cmd {} ::: */*.bam

Learn more:

Gnu Parallel - Parallelize Serial Command Line Programs Without Changing Them

written 2.3 years ago by ole.tange
st.ph.n wrote:

Is each file prefix also the name of its folder? Which variant calling command are you using? You could move all the BAM files into one folder and run something like this:

for file in *.bam; do variant_calling_cmd "$file"; done

Or take your file listing the prefixes (here called input_vars.txt) and create a bash script that enters the directory for a given prefix and runs the command:

#!/usr/bin/env bash
# $1 is the sample prefix, which is also the folder name
cd "/path/to/$1/" || exit 1
variant_calling_cmd "$1".bam

Save this as variant_call_all.sh, then pass each prefix to the script using xargs:

xargs -n 1 bash variant_call_all.sh < input_vars.txt

If there are flags or parameters to the variant calling program you're using, you will need to add them.
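
As an alternative to the helper script, the prefix list can also be read directly in a loop. This is a sketch only: the scratch directory and its contents are invented for the demonstration, input_vars.txt is assumed to hold one prefix per line, and `echo` stands in for variant_calling_cmd.

```shell
# Recreate the layout from the question plus the prefix list (illustrative only).
workdir=$(mktemp -d)
for sample in file1 file2 file3; do
    mkdir -p "$workdir/$sample"
    : > "$workdir/$sample/$sample.bam"
done
printf '%s\n' file1 file2 file3 > "$workdir/input_vars.txt"

processed=0
while IFS= read -r prefix; do
    [ -n "$prefix" ] || continue            # skip blank lines
    bam="$workdir/$prefix/$prefix.bam"      # folder name and file prefix match
    if [ -f "$bam" ]; then
        # Placeholder for the real call, e.g. variant_calling_cmd "$bam"
        echo "would run variant_calling_cmd on $bam"
        processed=$((processed + 1))
    else
        echo "missing: $bam" >&2
    fi
done < "$workdir/input_vars.txt"
echo "samples processed: $processed"
```

If the real command reads from standard input, give it `< /dev/null` inside the loop so it does not consume the rest of the prefix list.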

modified 2.3 years ago • written 2.3 years ago by st.ph.n