Salmon loop on multiple samples
24 months ago
m.storti

Hi everyone!

I'm new to the bioinformatics world, and I have some problems with Salmon. Specifically, I need to run Salmon on 101 paired-end samples (no replicates). I could do it one by one, but that seems error-prone. I tried some loops that I found online, but none of them worked. I already have the index. Below are some of the samples I need to quantify.

R55_S1_R2_001.fastq
R56_S2_R2_001.fastq
R57_S3_R2_001.fastq
R58_S4_R2_001.fastq
R59_S5_R2_001.fastq
R60_S6_R2_001.fastq
R61_S7_R2_001.fastq
R62_S8_R2_001.fastq

Do you have any suggestions? Many thanks in advance!


"I tried some loops that I found on the net, but none of them worked."

Show us what you tried.


Many thanks Pierre for your answer! I tried this:

#!/bin/bash

mkdir -p out/salmon

paste R1.list R2.list | while read -r R1 R2
do
    echo
    echo "/home/mstorti/FASTQ_Rabbit/run1_2/${R1},${R2}"
    # sample name: 7th "/"-separated field of the R2 path, up to the first "_"
    outdir=$(echo "${R2}" | cut -f7 -d"/" | cut -f1 -d"_")
    mkdir -p "out/salmon/${outdir}"

    # note: salmon's library-type flag is -l/--libType; "--l A" is not valid
    salmon quant \
        --index "/home/mstorti/FASTQ_Rabbit/run1_2/salmon/index" \
        -l A \
        -1 "${R1}" \
        -2 "${R2}" \
        --validateMappings \
        --output "out/salmon/${outdir}"

    echo "--Done."
done

And this one:

#!/bin/bash

# The original loop iterated over the directory path itself (a single pass)
# and was missing a line continuation after "-i"; glob the R1 files instead
# and derive the sample name and mate file from each one.
for fn in /home/mstorti/FASTQ_Rabbit/run1_2/*_R1_001.fastq
do
    samp=$(basename "${fn}" _R1_001.fastq)
    echo "Processing sample ${samp}"
    salmon quant -i /home/mstorti/FASTQ_Rabbit/run1_2/salmon/index \
        -l A \
        -1 "${fn}" \
        -2 "${fn/_R1_001/_R2_001}" \
        -p 8 --validateMappings \
        -o "/home/mstorti/FASTQ_Rabbit/run1_2/output_1/${samp}_quant"
done

When I submitted the job, the cluster killed it in 20 seconds.
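Independent of the cluster issue, a quick way to sanity-check the pairing logic before submitting anything is a dry run that only prints each salmon command. This sketch creates demo FASTQ names in a temp directory; the index path and flags mirror the attempts above and are assumptions. Drop the leading `echo` to actually run salmon.

```shell
#!/bin/bash
set -u

# Demo input: two empty FASTQ pairs named like the real samples.
FASTQ_DIR=$(mktemp -d)
touch "$FASTQ_DIR"/R55_S1_R1_001.fastq "$FASTQ_DIR"/R55_S1_R2_001.fastq
touch "$FASTQ_DIR"/R56_S2_R1_001.fastq "$FASTQ_DIR"/R56_S2_R2_001.fastq

INDEX=/home/mstorti/FASTQ_Rabbit/run1_2/salmon/index   # assumed index path

for r1 in "$FASTQ_DIR"/*_R1_001.fastq; do
    r2=${r1/_R1_001/_R2_001}                # matching mate file
    samp=$(basename "$r1" _R1_001.fastq)    # e.g. R55_S1
    mkdir -p "out/salmon/$samp"
    # dry run: remove "echo" to actually quantify
    echo salmon quant -i "$INDEX" -l A \
         -1 "$r1" -2 "$r2" \
         -p 8 --validateMappings \
         -o "out/salmon/$samp"
done
```

Running it should print one salmon command per R1/R2 pair and create the per-sample output directories, which makes it easy to spot a mismatched mate or a wrong sample name before burning cluster time.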


"the cluster killed it in 20 seconds."

Are you using the nodes of a cluster, or running on a server?

If (1), you must submit the job through a scheduler such as SGE or SLURM. If (2), you shouldn't run such a tool on a small server; see (1).
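On a SLURM cluster, submission usually means wrapping the loop in a batch script with `#SBATCH` directives at the top and handing it to `sbatch`. A minimal template along those lines; the resource numbers and paths are assumptions to adjust for your site, and it only runs under SLURM:

```shell
#!/bin/bash
#SBATCH --job-name=salmon_quant
#SBATCH --cpus-per-task=8
#SBATCH --mem=16G
#SBATCH --time=12:00:00

# Assumed data layout; adjust to your cluster's scratch area.
DATA=/home/mstorti/FASTQ_Rabbit/run1_2

for r1 in "$DATA"/*_R1_001.fastq; do
    samp=$(basename "$r1" _R1_001.fastq)
    salmon quant -i "$DATA"/salmon/index -l A \
        -1 "$r1" -2 "${r1/_R1_001/_R2_001}" \
        -p "$SLURM_CPUS_PER_TASK" --validateMappings \
        -o out/salmon/"$samp"
done
```

Submit it with `sbatch salmon_quant.sh`; SGE uses the same idea with `#$` directives and `qsub` instead.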


First of all, the data seems to be in $HOME, where on a cluster it typically does not belong. Is there no scratch drive to store data? Then I see uncompressed FASTQ files; there is no reason for that, gzip them to save space. Also, I see no scheduler instructions in the script, such as SLURM directives, as mentioned already. If you're new to cluster computing, talk to the admins or a colleague who has done this before. If you ran this on the head node, then the kill is expected (and intended).


You are right. I'm working on a server, not on a cluster. I don't know if it is small or not. I will contact the admin. Thank you!
