Iterating over different variables in bash script
7.1 years ago
Fatima ▴ 1000

I have a few lists of samples that I need to use, but I don't know how to address them in bash script.

declare -a sample1=("402.fasta" "440.fasta" "410.fasta" "405.fasta")
declare -a sample2=("403.fasta" "360.fasta" "230.fasta" "408.fasta")
.
.
.
declare -a sample9=("400.fasta" "340.fasta" "360.fasta" "436.fasta")

I don't know how to iterate over different values.

I need something like

for((i=1;i<10;i++));
do
for((j=0;j<4;j++));
do
var=sample$i
echo ${!var[$j]}
done
done

But it only returns 402.fasta, 403.fasta,... 400.fasta, the first file in each sample set!

I have a script to which I need to pass the files in each set, separated by commas, for example

somescript 402.fasta,440.fasta,410.fasta,405.fasta

for((i=1;i<10;i++));
do 
somescript ${set$i[0]}, ${set$i[1]},${set$i[2]},${set$i[3]}              
done

but ${set$i[3]} doesn't work

I also need to run something like the following, where allfilesinset1 stands for all the files in set 1:

LINE_COUNT=$(find . -type f -print0 | wc -l allfilesinset1 | grep "total")
echo $LINE_COUNT

If all files in set 1 were in directory1 instead of allfilesinset1 I could use set1/*.fasta but I don't want to change the data sets or cat them together because it takes space, and also I need them individually too.

bashscript • 1.6k views

I think the over-generalization in the question, together with the lack of detail, makes this difficult to understand. Perhaps write out some pseudocode to better illustrate the steps you want to complete. Your end goal is unclear.

7.1 years ago
st.ph.n ★ 2.7k

If you always have 4 variables per run, you can use xargs.

List your filenames in a text file:

402.fasta 440.fasta 410.fasta 405.fasta 
403.fasta 360.fasta 230.fasta 408.fasta
...

400.fasta 340.fasta 360.fasta 436.fasta

The four file names on each line would be passed to your script as $1, $2, $3, and $4.

So, running:

cat variables.txt | xargs -n 4 bash somescript.sh

is the same as

bash somescript.sh 402.fasta 440.fasta 410.fasta 405.fasta

but it will run as many times as there are lines in your variables.txt

To get the line count, you will have to cat the files, but you don't need to put them into a new file.

cat "$1" "$2" "$3" "$4" | wc -l
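
The two steps above can be sketched together as follows. This is only a minimal illustration: the placeholder FASTA files and their contents are invented for the demonstration, and an inline bash -c script stands in for somescript.sh.

```shell
#!/bin/bash
# Sketch only: file names and contents are made-up placeholders.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Four tiny placeholder files, two lines each.
for f in 402.fasta 440.fasta 410.fasta 405.fasta
do
    printf '>seq\nACGT\n' > "$f"
done

# One sample set per line, names separated by spaces.
echo "402.fasta 440.fasta 410.fasta 405.fasta" > variables.txt

# xargs hands four names at a time to the inline script, which sees them
# as "$@"; cat streams them into wc without creating a combined file.
total=$(xargs -n 4 bash -c 'cat "$@" | wc -l' _ < variables.txt | tr -d ' ')
echo "lines in set: $total"

# Clean up the temporary directory.
cd / && rm -rf "$workdir"
```

The tr -d ' ' step is there only because some wc implementations pad the count with spaces.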

Thanks. Following your code, I wrote the file names in a text file, one sample set per line, as you said, and used bash script.sh $(awk -v j="$j" 'NR==j+1' filename) to run my script :)

Also used Alex's code to access each file individually.
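
As an alternative to picking out one line at a time with awk, the file can probably also be read in a single pass. A minimal sketch, assuming one space-separated sample set per line; the temporary file and the echo are placeholders for the real list file and for "bash script.sh":

```shell
#!/bin/bash
# Sketch only: the list file and its contents are placeholders.
sets_file=$(mktemp)
printf '%s\n' \
    "402.fasta 440.fasta 410.fasta 405.fasta" \
    "403.fasta 360.fasta 230.fasta 408.fasta" > "$sets_file"

line_no=0
while read -r -a files
do
    line_no=$((line_no + 1))
    # "${files[@]}" would expand to the names on this line, one word each;
    # echo stands in for: bash script.sh "${files[@]}"
    echo "set $line_no: ${#files[@]} files, first is ${files[0]}"
done < "$sets_file"

rm -f "$sets_file"
```

read -a splits each line on whitespace into the array files, so individual names remain available as ${files[0]}, ${files[1]}, and so on inside the loop.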


What's wrong with using Alex's response?

for SAMPLE in $(seq 1 1 9)
do
    for REPLICATE in $(seq 1 1 4)
    do
        # wc -l on a single file prints its count without a "total" line
        wc -l "value${SAMPLE}${REPLICATE}.fasta"
    done
done

Thanks, but as I commented in Alex's post, I don't need to print value$i$j, I need to print value[$i][$j]. My filenames do not start with value, I just tried to generalize my question. Sorry for the confusion.

sample1=("filename1" "filename2"...)
sample2=("filename5" "filename6"...)

Post full format examples of the file names e.g. my_sample_1.fq.gz.

7.1 years ago
#!/bin/bash

# Two sample arrays, plus a third array holding their names.
declare -a sample1=("402.fasta" "440.fasta" "410.fasta" "405.fasta")
declare -a sample2=("403.fasta" "360.fasta" "230.fasta" "408.fasta")
declare -a samples=("sample1" "sample2")

for i in $(seq 0 1 1)
do
    echo "${samples[$i]}"
    for j in $(seq 0 1 3)
    do
        # Build a string such as "sample1[0]", then dereference it.
        value="${samples[$i]}[$j]"
        echo "${!value}"
    done
done

Prints:

$ ./iterate_v2.sh
sample1
402.fasta
440.fasta
410.fasta
405.fasta
sample2
403.fasta
360.fasta
230.fasta
408.fasta
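
The same indirection appears to work for a whole array at once if the reference string ends in [@], which is one way to build the comma-separated argument the question asks for. A minimal sketch: join_by_comma is a hypothetical helper, and echo stands in for the real somescript call.

```shell
#!/bin/bash
declare -a sample1=("402.fasta" "440.fasta" "410.fasta" "405.fasta")
declare -a sample2=("403.fasta" "360.fasta" "230.fasta" "408.fasta")

# Hypothetical helper: join all of its arguments with commas.
join_by_comma() {
    local IFS=,
    echo "$*"
}

for i in 1 2
do
    # The string "sample1[@]", then "sample2[@]".
    ref="sample$i[@]"
    # "${!ref}" expands to every element of the referenced array.
    joined=$(join_by_comma "${!ref}")
    # Replace echo with the real call, e.g. somescript "$joined".
    echo "somescript $joined"
done
```

Setting IFS locally inside the helper keeps the comma separator from leaking into the rest of the script.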
7.1 years ago

Or if you don't want to create a third array:

#!/bin/bash

declare -a sample1=("402.fasta" "440.fasta" "410.fasta" "405.fasta")
declare -a sample2=("403.fasta" "360.fasta" "230.fasta" "408.fasta")

for i in $(seq 1 1 2)
do
    # Build the array name itself: "sample1", then "sample2".
    sample="sample$i"
    echo "${sample}"
    for j in $(seq 0 1 3)
    do
        # Build a string such as "sample1[0]", then dereference it.
        value="${sample}[$j]"
        echo "${!value}"
    done
done

You just have to be careful about indexing: here the sample names use 1-based indexing, while elements within each sample array are referenced with 0-based indexing. This requires different arguments to seq in the inner and outer loops.

In any case, both this answer and the others should demonstrate the principles you need to solve this in bash.
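
If a newer bash is available, a name reference may be a tidier way to express the same idea. A minimal sketch, assuming bash 4.3 or later (where declare -n and local -n exist); print_set is a hypothetical function name:

```shell
#!/bin/bash
# Requires bash 4.3 or newer for namerefs (local -n / declare -n).
declare -a sample1=("402.fasta" "440.fasta" "410.fasta" "405.fasta")
declare -a sample2=("403.fasta" "360.fasta" "230.fasta" "408.fasta")

# Hypothetical helper: print every element of the array whose name is $1.
print_set() {
    local -n current=$1            # current becomes an alias for that array
    local j
    for j in "${!current[@]}"      # iterate over the array's own indices
    do
        echo "${current[$j]}"
    done
}

for i in 1 2
do
    echo "sample$i"
    print_set "sample$i"
done
```

Iterating over "${!current[@]}" avoids hard-coding the 0 to 3 index range, so sets of different sizes would be handled too.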

7.1 years ago

The following script:

#!/bin/bash

for i in `seq 1 1 9`
do
    for j in `seq 1 1 4`
    do
        echo "value$i$j"
    done
done

prints:

$ ./iterate.sh
value11
value12
value13
value14
value21
...
value93
value94

If you need arrays of arrays, you're probably better off with writing a script in Perl or Python.
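
For small cases that must stay in bash, a two-dimensional table can be approximated with an associative array keyed by "set,index". A minimal sketch, assuming bash 4.0 or later; the array name files is just an illustration:

```shell
#!/bin/bash
# Requires bash 4.0 or newer for associative arrays (declare -A).
# Each key is the plain string "set,index", e.g. "1,0".
declare -A files=(
    [1,0]="402.fasta" [1,1]="440.fasta" [1,2]="410.fasta" [1,3]="405.fasta"
    [2,0]="403.fasta" [2,1]="360.fasta" [2,2]="230.fasta" [2,3]="408.fasta"
)

for i in 1 2
do
    for j in 0 1 2 3
    do
        echo "files[$i,$j] = ${files[$i,$j]}"
    done
done
```

This is still a flat map rather than a true array of arrays, so the number of sets and the size of each set have to be tracked separately.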


I updated my question to clear up the confusion.


Please see my second and third answers elsewhere in this thread.
