I have four files from paired-end reads: SRX10603399_SRR14240730_1.fastq.gz, SRX10603399_SRR14240730_2.fastq.gz, SRX10603417_SRR14240748_1.fastq.gz, and SRX10603417_SRR14240748_2.fastq.gz. I want to use the STAR aligner to align the four files and get two bam files, one per sample. The code I have is producing four bam files. The following is my code:
module load software/star-2.7.9a
# define variables
index=/scratch/oknjav001/sarsCovRNA/star_index
# get our data files
FILES=/scratch/oknjav001/sarsCovRNA/pbmcs_healthyvscovid/pbmcs/fastq/*.fastq.gz
for f in $FILES
do
echo $f
base=$(basename $f .fastq.gz)
echo $base
STAR --runThreadN 3 --genomeDir $index --readFilesIn $f --outSAMtype BAM SortedByCoordinate --outTmpDir /scratch/oknjav001/sarsCovRNA/tempalign --quantMode GeneCounts \
--readFilesCommand zcat --outFileNamePrefix $base"_"
done
echo "done!"
What is the problem here?
I want to pass in _R1.fq.gz and _R2.fq.gz to get one combined bam file from the forward and reverse reads. I want to do this in a loop.
You are getting 4 bam files because you are running STAR 4 times, once per fastq file. This is because $FILES expands to 4 different fastq filenames, so each pass of the loop hands a single file to --readFilesIn.
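One way to get one bam per sample, as a rough sketch: loop over the _1 files only, derive the matching _2 filename from each, and give both to a single --readFilesIn. This assumes every pair follows the *_1.fastq.gz / *_2.fastq.gz naming from the question. The helper name mate_of is made up for illustration, and the module load line from the question would still go at the top.

```shell
# Sketch only: paths are copied from the question; mate_of is a hypothetical helper.
index=/scratch/oknjav001/sarsCovRNA/star_index
fastq_dir=/scratch/oknjav001/sarsCovRNA/pbmcs_healthyvscovid/pbmcs/fastq

# Given a read-1 path ending in _1.fastq.gz, print the matching read-2 path.
mate_of() {
    printf '%s\n' "${1%_1.fastq.gz}_2.fastq.gz"
}

# Loop over the read-1 files only, so STAR runs once per sample.
for r1 in "$fastq_dir"/*_1.fastq.gz
do
    # If the glob matched nothing, the literal pattern comes through; skip it.
    [ -e "$r1" ] || continue

    r2=$(mate_of "$r1")
    base=$(basename "$r1" _1.fastq.gz)   # e.g. SRX10603399_SRR14240730

    if [ ! -f "$r2" ]; then
        echo "skipping $base: mate file $r2 not found" >&2
        continue
    fi

    # Both mates go to one --readFilesIn, separated by a space.
    STAR --runThreadN 3 \
         --genomeDir "$index" \
         --readFilesIn "$r1" "$r2" \
         --readFilesCommand zcat \
         --outSAMtype BAM SortedByCoordinate \
         --quantMode GeneCounts \
         --outTmpDir /scratch/oknjav001/sarsCovRNA/tempalign \
         --outFileNamePrefix "${base}_"
done
echo "done!"
```

With the four files from the question this should run STAR twice, and the sorted bam files should come out as SRX10603399_SRR14240730_Aligned.sortedByCoord.out.bam and SRX10603417_SRR14240748_Aligned.sortedByCoord.out.bam.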