Hello my bio-friends,
Can anybody please help me fix my input arguments in Bash for GATK? I am trying to write a Bash script that automates the GATK processing steps (local realignment around indels, BQSR, and calling raw variants) for all my *bam samples in the current directory.
I have a problem in step two, realigning the BAM file: there are two input variables, because each sample has a *table.list file and a *_raw.bam file.
When I use code like this:
echo 'Realigning step starting'
for j in *list *bam
do java -Xmx32g -jar $gatk -T IndelRealigner -I $j -R $reference -targetIntervals $j -o ${i%.list}.realignedBam.bam
done;
echo 'Realigned step is done!!!'
I get this error: ##### ERROR MESSAGE: Couldn't read file in.bam because The interval file in.bam does not have one of the supported extensions (.bed, .list, .picard, .interval_list, or .intervals). Please rename your file with the appropriate extension.
I understand that in one for loop I have just one variable, j, but two input files. Is there a way to load two variables (*list, *bam) in one for loop for each sample?
I hope my question is clear...
Thank you for any ideas and help.
Petr.
Wow, could you please explain what is happening here: y=$(echo $i|sed 's/bam/list/g'); ? Thank you!
It finds the string "bam" in the $i variable and replaces it with "list". The alternative approach would be to just strip the extension off:
I have edited my original post.
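For readers following along, the two approaches mentioned above (the sed substitution and stripping the extension) can be sketched side by side. This is only an illustration: `sample1.bam` is a made-up file name, and the helper functions `list_name_sed` and `list_name_strip` are hypothetical names, not part of the original answer.

```bash
#!/usr/bin/env bash
# Two ways to derive the interval-list name from a BAM file name.
# "sample1.bam" is a made-up name used only for illustration.

# Approach 1: sed substitution, as in the command quoted above.
list_name_sed() {
    echo "$1" | sed 's/bam/list/g'
}

# Approach 2: strip the ".bam" suffix with parameter expansion,
# then append ".list".
list_name_strip() {
    echo "${1%.bam}.list"
}

list_name_sed "sample1.bam"     # prints sample1.list
list_name_strip "sample1.bam"   # prints sample1.list
```

The `${1%.bam}` form only touches the end of the name, so it is probably the safer choice when a sample name might itself contain the string "bam".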
Thank you guys!! That works perfectly fine for me!!! Just add a semicolon between the assignment and the java command:
do list_name=${j%.bam}; java -Xmx32g -.......
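A rough sketch of how the paired loop might look is given below as a dry run: it prints one IndelRealigner command per BAM file instead of running it. It assumes each sample has `X.bam` and `X.list` side by side, which may not match your real file names. The function name `print_realign_commands` and the values assigned to `gatk` and `reference` are placeholders for illustration; only the GATK flags are taken from the original post.

```bash
#!/usr/bin/env bash
# Dry run: print the realignment command for each BAM file given as argument.
# Assumption: the interval list for X.bam is named X.list.

print_realign_commands() {
    local j list_name
    for j in "$@"; do
        # Strip the ".bam" suffix once and reuse the base name for both
        # the interval list and the output file.
        list_name=${j%.bam}
        echo "java -Xmx32g -jar $gatk -T IndelRealigner -I $j -R $reference -targetIntervals $list_name.list -o $list_name.realignedBam.bam"
    done
}

gatk=GenomeAnalysisTK.jar    # placeholder path, adjust to your setup
reference=reference.fasta    # placeholder path, adjust to your setup

print_realign_commands sampleA.bam sampleB.bam
```

Putting the assignment and the java command on separate lines has the same effect as separating them with a semicolon on one line. For a real run you would call the function with `*.bam` and replace the `echo "..."` with the command itself.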
Hmm, do you really need the g option in sed? It should work fine without it.
I know, it doesn't make much sense here, but I got used to using g in almost all cases, can't help it :)
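To make the difference concrete, here is a small sketch of the substitution with and without the g flag. The file name `bam_test.bam` is invented so that "bam" appears twice, and the three function names are hypothetical. The last variant, which anchors the pattern to the end of the name, is my own addition and not something suggested in the thread.

```bash
#!/usr/bin/env bash
# "bam_test.bam" is a made-up name that contains "bam" twice.

# Without g: only the first match on the line is replaced.
first_only() { echo "$1" | sed 's/bam/list/'; }

# With g: every match on the line is replaced.
all_matches() { echo "$1" | sed 's/bam/list/g'; }

# Anchored to the end: only a trailing ".bam" is replaced.
suffix_only() { echo "$1" | sed 's/\.bam$/.list/'; }

first_only  "bam_test.bam"   # prints list_test.bam
all_matches "bam_test.bam"   # prints list_test.list
suffix_only "bam_test.bam"   # prints bam_test.list
```

For a name like `in.bam`, where "bam" occurs once, the first two variants give the same result, which is why the g flag makes no difference in this thread.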