Question: Using a bash loop script to map multiple samples in PBS
21 months ago, t86dan wrote:

I am using SSH to connect to my school's server to map a couple hundred fastq.gz samples to a reference genome. Everything is uploaded to the server, and I am trying to submit the following script. The same script ran without problems on my personal Ubuntu computer, but it does not seem to work on the server. I am also not sure whether this script actually uses the 8 processors and 16 GB of memory that I am requesting. Thanks in advance!

#!/bin/bash
#PBS -m abe
#PBS -N bowtie2_alignment
#PBS -l walltime=10:00:00
#PBS -l nodes=1:ppn=8:intel,mem=16gb
#PBS -j oe

#Loading Java
module load jdk/1.8.0_31
module load bowtie/2-2.2.4

#Changing to current directory
cd $PWD

#Mapping with bowtie2    
for i in $(ls *.fastq.gz | rev | cut -c 10- | rev | uniq)
do
  REFERENCE=/PATH_to_ref;
  PICARD=/PATH_to_picard.jar;
# map the reads and sort the alignment
  bowtie2 --rg-id ${i} --rg SM:${i} --rg PL:ILLUMINA -t -x ${REFERENCE} -U ${i}.fastq.gz 2> ${i}_bowtie2.log; 
done;

I get the following error:

ls: cannot access *.fastq.gz: No such file or directory

I am new to running these types of scripts on a server over SSH, so any help would be appreciated. Thanks.


To begin with, those fastq files do not appear to be in the directory the script is running from. The easiest thing would be to change into the directory that contains the fastq files and see whether the script runs there.
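One way to make a missing-input problem obvious is to check for the files before looping over them. The sketch below is only an illustration, not the original poster's script: the function name list_samples is hypothetical, and it uses a shell glob plus parameter expansion in place of the ls | rev | cut pipeline.

```shell
#!/bin/bash
# list_samples DIR: print one sample name per *.fastq.gz file in DIR,
# or return 1 with a message on stderr if DIR holds no such files.
# (list_samples is a hypothetical helper name used for illustration.)
list_samples() {
  local dir=$1 f found=0
  for f in "$dir"/*.fastq.gz; do
    [ -e "$f" ] || continue   # an unmatched glob stays literal; skip it
    found=1
    f=${f##*/}                # drop the directory part
    echo "${f%.fastq.gz}"     # drop the .fastq.gz extension
  done
  if [ "$found" -eq 0 ]; then
    echo "no *.fastq.gz files in $dir" >&2
    return 1
  fi
}
```

Calling list_samples . at the top of the job script would then fail early with a clear message if the job is not running in the directory that holds the reads.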

Reply written 21 months ago by genomax

Replace $PWD with the actual folder where the fastq files are:

#Changing to current directory
#cd $PWD
cd /path/to/fastq

Reply written 21 months ago by h.mon

Hint: What is $PWD set to on the cluster nodes? Are you sure it is the directory you are submitting the job from (do an echo $PWD in the script and capture stdout)?
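A minimal sketch of that check, assuming a PBS/Torque scheduler: PBS sets PBS_O_WORKDIR to the directory qsub was run from, while the job itself commonly starts somewhere else (often the home directory). The variable name submit_dir and the fallback to $PWD when PBS_O_WORKDIR is unset are additions for illustration, so the snippet also runs outside a job.

```shell
#!/bin/bash
# Sketch: show where a job starts versus where it was submitted from.
# PBS_O_WORKDIR is set by PBS/Torque inside a job; outside a job it is
# unset, so fall back to the current directory (an assumption made here
# so the snippet can be tried interactively).
submit_dir="${PBS_O_WORKDIR:-$PWD}"

echo "job started in:  $PWD"
echo "submitted from:  $submit_dir"

# Move to the submission directory before globbing for input files.
cd "$submit_dir" || { echo "cannot cd to $submit_dir" >&2; exit 1; }
echo "now working in:  $PWD"
```

With #PBS -j oe in the header, these lines end up in the job's combined output file, which makes it easy to see whether the first two directories differ.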

Reply written 21 months ago by Devon Ryan

Define $PWD and make sure that the .fastq.gz files are in that directory.

Reply written 21 months ago by cpad0112

Do not set $PWD yourself: $PWD is an environment variable that the shell updates automatically every time you use cd, so assigning to it can lead to unexpected behaviour.

$ mkdir {foo1,foo2}
$ cd foo1
$ echo $PWD
/home/user/foo1
$ PWD=/home/user/foo3
$ echo $PWD
/home/user/foo3
$ pwd
/home/user/foo1
$ cd $PWD
bash: cd: /home/user/foo3: No such file or directory
$ cd ../foo2/
$ echo $PWD
/home/user/foo2

In my interactive shell, $PWD is used to build my prompt, so if I change it, my prompt becomes wrong as well:

hmon@desk:~$ PWD=/some/imaginary/folder
hmon@desk:/some/imaginary/folder$ pwd
/home/hmon

Edit: check the wonderfully comprehensive answer from Stéphane Chazelas on this Unix & Linux Stack Exchange post, which explains $PWD and a lot more.

Reply written 21 months ago by h.mon