For loop script
6.9 years ago

Hi, I am trying to run a command in a for loop, but it is not working. I have 49 files that I need to demultiplex and then align. I thought this would be easy to do in a for loop, but I have not written a loop script before. I put this one together with the help of a friend, but there seem to be some problems. Could someone help me correct it? Here is the script:

nohup sh -c 'for fq1 in *.fastq.gz; do
fq2=${fq1/_R1_/_R2_}
out=`basename $fq1 _L001_R1_001.fastq.gz`
echo $fq1, $fq2, $out 
if [[ ! -f "$fq1" ]]; then echo "WRONG"; break; fi
if [[ ! -f "$fq2" ]]; then echo "WRONG"; break; fi
fastq-multx -B barcode.txt $fq1 $fq2 -o ${fq1%%.fastq.gz}_%.fa.gz; done`
RNA-Seq fastq-multx • 4.0k views
ADD COMMENT

Hi, next time please format your code so that it is easier to read. Did you check my question, "shell script to alignment paired-end reads"?

ADD REPLY

It's a little hard to read your script since it is not formatted properly in the post, but one thing that immediately jumps out is that you are calling the sh shell while using [[ ... ]] for tests. As far as I know, [[ ... ]] is a bash feature that is not part of standard sh. Is there a reason you are running the command this way and not as a regular bash script?
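
To illustrate the point, here is a minimal sketch of the same existence check written with the standard [ ... ] test, which works under both sh and bash. The file names and the helper names r2_name and pair_exists are made up for the demonstration. The sed call stands in for the ${fq1/_R1_/_R2_} substitution, which is also a bash extension that a strict sh such as dash rejects.

```shell
#!/bin/sh
# Hypothetical helper: derive the R2 name from an R1 name without bash-only syntax.
r2_name() {
    printf '%s\n' "$1" | sed 's/_R1_/_R2_/'
}

# Hypothetical helper: succeed only if both mates exist as regular files.
pair_exists() {
    [ -f "$1" ] && [ -f "$2" ]
}

# Demo in a scratch directory with two empty placeholder files
demo_dir=$(mktemp -d)
touch "$demo_dir/sampleA_L001_R1_001.fastq.gz" "$demo_dir/sampleA_L001_R2_001.fastq.gz"

fq1="$demo_dir/sampleA_L001_R1_001.fastq.gz"
fq2=$(r2_name "$fq1")

if pair_exists "$fq1" "$fq2"; then
    echo "pair found"
else
    echo "mate missing"
fi

rm -rf "$demo_dir"
```

If the script is meant to be bash anyway, the simpler route is to keep [[ ... ]] and run it with bash instead of sh.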

ADD REPLY

I am trying to run command in the for loop but it is not working

What error are you getting? Does it appear to run but produce no output file? Remove nohup and put your code into a bash script, broken up over several lines so that it is easier to read. A one-liner can be difficult to write at first and is clunky to read.

for fq1 in *.fastq.gz; do
    fq2=${fq1/_R1_/_R2_}
    out=`basename $fq1 _L001_R1_001.fastq.gz`
    echo $fq1, $fq2, $out

    if [[ ! -f "$fq1" ]]; then
        echo "WRONG"; break
    fi
    if [[ ! -f "$fq2" ]]; then
        echo "WRONG"; break
    fi

    fastq-multx -B barcode.txt $fq1 $fq2 -o ${fq1%%.fastq.gz}_%.fa.gz
done

ADD REPLY

Error: number of input files (2) must match number of output files following '-o'.

ADD REPLY

Hi Steve, thanks for the comment. There is no particular reason; I was given this script and just ran it. Could you suggest how to correct it?

ADD REPLY

Please use ADD COMMENT or ADD REPLY to respond to earlier posts, so that the thread stays logically structured and easy to follow. I have now moved your post, but as you can see the result is not optimal. Adding an answer should only be used for providing a solution to the question asked.

ADD REPLY

Oh, I see. I am new to this, but I will follow that from now on.

ADD REPLY

but it is not working

I see.

seems there are some problem

Aha!

It would be very helpful if you could tell us what is going wrong: an error message or an unexpected outcome.

ADD REPLY

Hi Paul, I read your question and tried the suggested answer:

for i in $(ls *.fastq.gz | rev | cut -c 13- | rev | uniq)
do
    fastq-multx -B barcode.txt ${i}_R1_001.fastq.gz ${i}_R2_001.fastq.gz -o ${i%.fastq.gz}_%.fa.gz
done

It fails with an error message saying that the number of output files must match the number of input files.
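
Separately from that error, it is worth checking how many characters the pipeline trims. A minimal sketch, assuming the sample1_L001_R1_001.fastq.gz naming shown elsewhere in this thread: rev | cut -c 13- | rev removes the last 12 characters, which is only 001.fastq.gz, so the prefix still ends in _R1_ (or _R2_). The twelve ? wildcards below mimic that 12-character trim in plain shell, and the second expansion strips the known suffix by name instead of by count.

```shell
#!/bin/sh
f="sample1_L001_R1_001.fastq.gz"

# Drop the last 12 characters, as rev | cut -c 13- | rev does
trimmed=${f%????????????}
echo "$trimmed"    # sample1_L001_R1_

# Drop the full read suffix instead, leaving the sample/lane prefix
prefix=${f%_R1_001.fastq.gz}
echo "$prefix"     # sample1_L001

# Both mate names can then be rebuilt from the prefix
echo "${prefix}_R1_001.fastq.gz ${prefix}_R2_001.fastq.gz"
```

With the shorter prefix, the ${i}_R1_001.fastq.gz and ${i}_R2_001.fastq.gz names in the loop would point at files that actually exist.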

ADD REPLY

I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in the toolbar when you compose or edit a post.

ADD REPLY
6.9 years ago
steve ★ 3.5k

There are numerous issues with your script. You might want to spend some time with some of these tutorials to brush up on the syntax and formatting, which will fix some of the problems.

I reformatted it to look a little better, and it should be closer to running (I can't test it myself):

#!/bin/bash

# iterate over the R1 .fastq.gz files only, so each pair is processed once
for fq1 in *_R1_*.fastq.gz; do
    # create the fastq 2 filename by substituting "_R1_" in the filename with "_R2_"
    fq2="${fq1/_R1_/_R2_}"

    # create an output name by stripping the suffix from the input fastq file name
    out="$(basename "$fq1" _L001_R1_001.fastq.gz)"

    # you are probably going to need to specify the output directory as well,
    # gonna assume it's the same as the input directory?
    # give it a new file extension? '.demult.fastq.gz' or something like that
    output_path="$(dirname "$fq1")/${out}.demult.fastq.gz"
    echo "$fq1, $fq2, $out"

    # you are checking to make sure that both fastq1 and fastq2 exist...
    if [ ! -f "$fq1" ]; then
        echo "WRONG"
        break
    fi
    if [ ! -f "$fq2" ]; then
        echo "WRONG"
        break
    fi

    # I am not really sure what you meant by '${fq1%%.fastq.gz}_%.fa.gz' ...
    # NOTE: as the replies below point out, fastq-multx wants one output
    # after -o per input file, so this single output triggers that error
    fastq-multx -B barcode.txt "$fq1" "$fq2" -o "$output_path"
done

However, there are still a lot of other issues with this script; since I love writing bash scripts, I reformatted the whole thing in a way which might work better:

#!/bin/bash

fastq_dir="$1"
output_dir="$2"
barcode_file="barcode.txt"
mkdir -p "$output_dir"

# find the R1 fastq files, e.g. 'fastq/sample1_L001_R1_001.fastq.gz'
# (IFS= and -r keep whitespace and backslashes in file names intact)
find "$fastq_dir" -type f -name "*.fastq.gz" -name "*_R1_*" -print0 | while IFS= read -r -d '' fastqR1; do
    fastqR1_base="$(basename "$fastqR1")" # sample1_L001_R1_001.fastq.gz

    # split the basename on the last '_R1_' and insert '_R2_'
    fastqR2_base="${fastqR1_base%_R1_*}_R2_${fastqR1_base##*_R1_}" # sample1_L001_R2_001.fastq.gz

    # make full path to the expected R2 fastq file
    fastqR2="$(dirname "$fastqR1")/${fastqR2_base}"

    # make path to output file
    output_base="${fastqR1_base%_R1_*}_${fastqR1_base##*_R1_}" # sample1_L001_001.fastq.gz
    output_path="${output_dir}/${output_base}"

    # make sure fastqR2 exists; R1 files without a mate are skipped
    if [ -f "$fastqR2" ] ; then
        echo "Demultiplexing $fastqR1 and $fastqR2 with barcode file $barcode_file, output will be: $output_path"
        # NOTE: as the replies below point out, fastq-multx wants one output
        # after -o per input file, so this single output triggers that error
        fastq-multx -B "$barcode_file" "$fastqR1" "$fastqR2" -o "$output_path"
    fi

done

Usage:

$ ls -1 fastq/
sample1_L001_R1_001.fastq.gz
sample1_L001_R2_001.fastq.gz
sample2_L001_R1_001.fastq.gz
sample2_L001_R2_001.fastq.gz
sample3_L001_R1_001.fastq.gz

$ ./demult2.sh fastq/ output
Demultiplexing fastq//sample1_L001_R1_001.fastq.gz and fastq/sample1_L001_R2_001.fastq.gz with barcode file barcode.txt, output will be: output/sample1_L001_001.fastq.gz
Demultiplexing fastq//sample2_L001_R1_001.fastq.gz and fastq/sample2_L001_R2_001.fastq.gz with barcode file barcode.txt, output will be: output/sample2_L001_001.fastq.gz
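
The two parameter expansions in the script can be tried on their own. A small sketch using one of the example file names above: %_R1_* drops the shortest trailing match, ##*_R1_ drops the longest leading match, and gluing the two halves back together around _R2_ gives the mate's name. The variable names head_part and tail_part are only for illustration.

```shell
#!/bin/sh
fastqR1_base="sample1_L001_R1_001.fastq.gz"

head_part=${fastqR1_base%_R1_*}     # text before the last "_R1_"
tail_part=${fastqR1_base##*_R1_}    # text after the last "_R1_"

fastqR2_base="${head_part}_R2_${tail_part}"
output_base="${head_part}_${tail_part}"

echo "$head_part"      # sample1_L001
echo "$tail_part"      # 001.fastq.gz
echo "$fastqR2_base"   # sample1_L001_R2_001.fastq.gz
echo "$output_base"    # sample1_L001_001.fastq.gz
```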

Hope that helps.

ADD COMMENT

Hi, thanks for the comment. With this I get the following error message:

Error: number of input files (2) must match number of output files following '-o'.

ADD REPLY
for fq1 in *.fastq.gz; do
    fq2="${fq1/_R1_/_R2_}"
    out="$(basename ${fq1}_L001_R1_001.fastq.gz)"
    output_path="$(dirname ${fq1}_L001_R1_001.fastq.gz)/${out}_%.fa.gz"
    echo "$fq1, $fq2, $out"
    if [[ ! -f "$fq1" ]]; then
        echo "WRONG"
        break
    fi
    if [[ ! -f "$fq2" ]]; then
        echo "WRONG"
        break
    fi
    fastq-multx -B barcode.txt "$fq1" "$fq2" -o "$output_path"
done

This is the script I ran, based on what you suggested.

ADD REPLY

There is nothing wrong with the script/loop. You are just calling fastq-multx incorrectly, and the error message tells you exactly what's wrong. You have two input files ($fq1 and $fq2), so you need two output files.


ADD REPLY

Thanks for the comment! What is wrong with the fastq-multx command?

ADD REPLY

It just told you. Read the error message! It's one of the best error messages you'll ever see!

number of input files (2) must match number of output files following '-o'.

You have 2 input files. You need 2 output files. You only specified 1 output file.
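
In practice that means giving -o one output name per input file. A minimal sketch, assuming the sample1_L001_R1_001.fastq.gz naming used in this thread and assuming, as the original command suggests, that % in each output name is the placeholder fastq-multx fills in with the barcode ID. The helper name multx_cmd and the output names are made up for illustration, and the function only prints the command so it can be checked before running anything.

```shell
#!/bin/sh
# Hypothetical helper: print the fastq-multx command for one R1/R2 pair,
# with one -o output template per input file.
multx_cmd() {
    fq1=$1
    fq2=$(printf '%s\n' "$fq1" | sed 's/_R1_/_R2_/')
    prefix=${fq1%_R1_*}    # e.g. sample1_L001
    printf 'fastq-multx -B barcode.txt %s %s -o %s -o %s\n' \
        "$fq1" "$fq2" "${prefix}_R1_%.fastq.gz" "${prefix}_R2_%.fastq.gz"
}

# Dry run for a single file name
multx_cmd sample1_L001_R1_001.fastq.gz
```

Once the printed command looks right, the same arguments can be passed to fastq-multx directly inside a loop over the *_R1_*.fastq.gz files.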

ADD REPLY

Thanks, I've got it now. I thought you meant something else.

ADD REPLY
