Question: Using a bash loop script to map multiple samples in PBS
t86dan wrote (2.4 years ago):

I am connecting over SSH to my school's server to map a couple hundred fastq.gz samples to a reference genome. Everything is uploaded to the server, and I am trying to submit the following script. The same script ran without problems on my personal Ubuntu computer, but it does not seem to work on the server. I am also not sure whether this script actually uses the 8 processors and 16 GB of memory that I am requesting. Thanks in advance!

#!/bin/bash
#PBS -m abe
#PBS -N bowtie2_alignment
#PBS -l walltime=10:00:00
#PBS -l nodes=1:ppn=8:intel,mem=16gb
#PBS -j oe

#Loading Java
module load jdk/1.8.0_31
module load bowtie/2-2.2.4

#Changing to current directory
cd $PWD

#Mapping with bowtie2    
for i in $(ls *.fastq.gz | rev | cut -c 10- | rev | uniq)
do
  REFERENCE=/PATH_to_ref;
  PICARD=/PATH_to_picard.jar;
# map the reads and sort the alignment
  bowtie2 --rg-id ${i} --rg SM:${i} --rg PL:ILLUMINA -t -x ${REFERENCE} -U ${i}.fastq.gz 2> ${i}_bowtie2.log; 
done;

I get the following error:

ls: cannot access *.fastq.gz: No such file or directory

I am new to running these types of scripts on an SSH server, so any help would be appreciated. Thanks

Tags: pbs, alignment, bowtie2

To begin with, those fastq files do not appear to be in the directory where the script is running. The easiest thing would be to change into the directory containing the fastq files and check whether the script runs there.

written 2.4 years ago by genomax

Replace $PWD with the actual folder where the fastq files are:

#Changing to current directory
#cd $PWD
cd /path/to/fastq
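
A rough sketch of how the loop could look once the directory is set explicitly. It also touches on the processor question: bowtie2 runs a single thread unless `-p` is given, so requesting ppn=8 alone does not make it use 8 cores. The names FASTQ_DIR and THREADS are hypothetical, the `-S` output file is an assumption (the original script leaves the SAM output on stdout), and the command is only printed here as a dry run:

```shell
#!/bin/bash
# Hypothetical variables for illustration; point FASTQ_DIR at the real folder.
FASTQ_DIR="${FASTQ_DIR:-.}"
THREADS=8                      # assumed to match ppn=8 in the #PBS -l line
REFERENCE=/PATH_to_ref         # placeholder taken from the original script

cd "$FASTQ_DIR" || exit 1

for f in *.fastq.gz; do
  # If the glob matched nothing, $f is the literal pattern and does not exist.
  [ -e "$f" ] || { echo "No .fastq.gz files in $PWD" >&2; break; }
  i="${f%.fastq.gz}"           # strip the suffix to get the sample name
  # Dry run: prints the command. Remove 'echo' to actually run bowtie2.
  echo bowtie2 -p "$THREADS" --rg-id "$i" --rg "SM:$i" --rg PL:ILLUMINA \
       -t -x "$REFERENCE" -U "$f" -S "$i.sam"
done
```

Using the glob directly with `${f%.fastq.gz}` avoids parsing the output of ls, and the existence check gives a clear message when the directory is wrong.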
written 2.4 years ago by h.mon

Hint: what is $PWD set to on the cluster nodes? Are you sure it is the directory you are submitting the job from (do an echo $PWD in the script and capture stdout)?
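
One way to run that check, as a minimal sketch. It assumes a Torque/PBS scheduler, which exports PBS_O_WORKDIR (the directory qsub was run from) into the job environment; outside PBS the sketch falls back to the current directory so it can still be tried interactively:

```shell
#!/bin/bash
# Assumption: under Torque/PBS, PBS_O_WORKDIR holds the submission directory.
# Fall back to $PWD when the variable is not set (e.g. on a desktop).
workdir="${PBS_O_WORKDIR:-$PWD}"

echo "PWD at job start: $PWD"
echo "Submission dir:   $workdir"

cd "$workdir" || exit 1

# Count the fastq.gz files visible from here without parsing ls output.
count=0
for f in *.fastq.gz; do
  [ -e "$f" ] || continue
  count=$((count + 1))
done
echo "Found $count fastq.gz file(s) in $PWD"
```

If the two directories printed differ on the cluster, `cd $PWD` in the job script is a no-op in the wrong place, which would explain the ls error.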

written 2.4 years ago by Devon Ryan

Define $PWD and make sure that the .fastq.gz files are in that directory.

written 2.4 years ago by cpad0112

Do not set $PWD: it is an environment variable that the shell sets automatically every time you use cd, so messing with it could lead to unexpected behaviour.

$ mkdir {foo1,foo2}
$ cd foo1
$ echo $PWD
/home/user/foo1
$ PWD=/home/user/foo3
$ echo $PWD
/home/user/foo3
$ pwd
/home/user/foo1
$ cd $PWD
bash: cd: /home/user/foo3: No such file or directory
$ cd ../foo2/
$ echo $PWD
/home/user/foo2

In my interactive shell, $PWD is used to set my prompt, so if I mess with it, my prompt gets messed up as well:

hmon@desk:~$ PWD=/some/imaginary/folder
hmon@desk:/some/imaginary/folder$ pwd
/home/hmon

Edit: check the wonderfully comprehensive answer from Stéphane Chazelas on this Unix & Linux Stack Exchange post, which explains $PWD and a lot more.

written 2.4 years ago by h.mon