I'm trying to run kallisto quant on the fastq.gz files of multiple samples, but in the output directory I see only one abundance.tsv file, along with one .json and one .h5 file.
#!/bin/bash
#SBATCH --job-name=Kallisto
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=4G
#SBATCH --time=1-00:00:00
#SBATCH --qos=1day
#SBATCH --output=kallisto_%A_%a.out
#SBATCH --error=kallisto_%A_%a.err
#SBATCH --array=1-50%10
LBID=$(head -$SLURM_ARRAY_TASK_ID path/samples.txt | tail -1)
module load kallisto/0.43.1-goolf-1.7.20
dir="/usr/allSamples"
kallisto quant -i kallisto_GRCh38.p10_gencodev27.idx -o output --rf-stranded ${dir}/$LBID.1.fastq.gz ${dir}/$LBID.2.fastq.gz
So, I have the fastq.gz files for all the samples in the /usr/allSamples directory, and samples.txt looks like the listing below.
samples.txt:
Sample100
Sample101
Sample103
Sample178
And the fastq.gz files in the /usr/allSamples directory:
Sample100.1.fastq.gz
Sample100.2.fastq.gz
Sample101.1.fastq.gz
Sample101.2.fastq.gz
Sample103.1.fastq.gz
Sample103.2.fastq.gz
Sample178.1.fastq.gz
Sample178.2.fastq.gz
How can I get separate abundance.tsv, .json and .h5 files for each sample, and how can I merge all the outputs into a single file?
Any help is appreciated. Thank you.
Yes, I know this, but I'm not sure how I can specify the sample name as the output name. Can you please tell me?
You need to parametrise the value you pass to -o the same way you parametrise the paths of the input fastq files.
But -o is the output directory, which holds the abundance.tsv, abundance.h5 and run_info.json output files. So, if I give $LBID in place of output, it will create Sample100, Sample101, etc. directories containing the output files. Am I right?
Since the program produces output files with identical names each time, you have no option but to segregate the output of different samples into different directories.
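To make the suggestion above concrete, here is a minimal sketch of a per-sample output directory. The index name, flags and input paths are taken from the script in the question; the output/<sample> layout and the hard-coded LBID are only assumptions for illustration (in the job script, LBID would still come from samples.txt via $SLURM_ARRAY_TASK_ID). The command is echoed rather than executed so that it can be checked first.

```shell
#!/bin/bash
# Sketch only: build a separate output directory for each sample.
LBID="Sample100"             # illustration; the job script reads this from samples.txt
dir="/usr/allSamples"
outdir="output/${LBID}"      # assumed layout: one subdirectory per sample

# Create the directory (and its parent) up front so the path is known to exist.
mkdir -p "$outdir"

# Print the command that would be run; remove 'echo' to actually run kallisto.
echo kallisto quant -i kallisto_GRCh38.p10_gencodev27.idx -o "$outdir" --rf-stranded "${dir}/${LBID}.1.fastq.gz" "${dir}/${LBID}.2.fastq.gz"
```

With this change each array task writes its abundance.tsv, abundance.h5 and run_info.json into its own directory instead of overwriting a shared one.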
You already know how to incorporate a variable part and a constant part when specifying a file name. It's the same.
If you did this by copy-pasting what someone else did, all I can say is that you are going to struggle mightily unless you learn to understand what you are doing, so that you can adapt it. No one wants to write your custom command lines for you. You have to learn the principles for yourself.
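On the second part of the question (merging), one possible approach, once every sample has its own directory, is to paste the tpm column of each abundance.tsv side by side. This is only a sketch under some assumptions: the function name merge_tpm is hypothetical, each sample's file is expected at <base>/<sample>/abundance.tsv, all runs used the same index (so the rows are in the same order), and sample names contain no "/" characters. Column 5 of abundance.tsv is tpm; use column 4 instead for est_counts.

```shell
#!/bin/bash
# Sketch only: merge the tpm column of several kallisto runs into one table.
# Usage: merge_tpm <samples.txt> <base_dir>  (hypothetical helper)
merge_tpm() {
    local samples="$1" base="$2" first=1 tmp s f
    tmp=$(mktemp -d)
    while read -r s; do
        [ -n "$s" ] || continue              # skip empty lines
        f="$base/$s/abundance.tsv"
        if [ "$first" -eq 1 ]; then
            cut -f1 "$f" > "$tmp/merged"     # target_id column, taken once
            first=0
        fi
        # Take column 5 (tpm) and rename its header to the sample name.
        cut -f5 "$f" | sed "1s/.*/$s/" > "$tmp/col"
        paste "$tmp/merged" "$tmp/col" > "$tmp/next"
        mv "$tmp/next" "$tmp/merged"
    done < "$samples"
    cat "$tmp/merged"                        # merged table goes to stdout
    rm -rf "$tmp"
}
```

It could then be called as, for example, merge_tpm path/samples.txt output > merged_tpm.tsv, which gives one row per transcript and one column per sample.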