Help with for loop for sequence alignment using STAR
3.3 years ago

Hi, I am trying to run a sequence alignment with STAR. I have a total of 28 paired-end files, 14 R1 and 14 R2. My files are named like this:

mapped_trimmed.LLC7b_Aligned.sortedByCoord.out.bam_R2.fq
mapped_trimmed.LLC7a_Aligned.sortedByCoord.out.bam_R2.fq
mapped_trimmed.LLC1b_Aligned.sortedByCoord.out.bam_R2.fq
mapped_trimmed.LLC1a_Aligned.sortedByCoord.out.bam_R2.fq
mapped_trimmed.LLC7b_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC7a_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC1b_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC1a_Aligned.sortedByCoord.out.bam_R1.fq

The code I have written to run STAR so far is:

#!/bin/bash --login
#$ -cwd
#$ -l short
#$ -pe smp.pe 12

module load apps/intel-18.0/star/2.7.2b

#get only files names
for i in *R1.fq; do name=$(basename ${i} _R1.fq);

  STAR --genomeDir /scratch/STAR_index  \ #Path to the index generated previously
   --runThreadN 12 \ #Number of cores
   --readFilesIn ${name}_R1.fq ${name}_R2.fq \ #Path to the input files (forward and reverse)
  --outFileNamePrefix ${name}_aligned_transcriptome \ #Prefix to the output files
  --outSAMtype BAM SortedByCoordinate \ 
  --limitBAMsortRAM 31000000000
done

Yet I keep getting this error:

/opt/site/sge/default/spool/node403/job_scripts/1635775: line 17: syntax error near unexpected token `('
/opt/site/sge/default/spool/node403/job_scripts/1635775: line 17: `  --readFilesIn ${name}_R1.fq ${name}_R2.fq \ #Path to the input files (forward and reverse)'

This is the output of the part of the code that should feed into line 17

for i in *R1.fq; do name=$(basename ${i} _R1.fq); echo $name
> done 
mapped_trimmed.LLC1a_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC1b_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC2a_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC2b_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC3a_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC3b_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC4a_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC4b_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC5a_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC5b_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC6a_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC6b_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC7a_Aligned.sortedByCoord.out.bam
mapped_trimmed.LLC7b_Aligned.sortedByCoord.out.bam

And this is what line 17 itself expands to, and it looks all right, so I don't understand why this loop is not working:

for i in *R1.fq; do name=$(basename ${i} _R1.fq); echo ${name}_R1.fq; done
mapped_trimmed.LLC1a_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC1b_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC2a_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC2b_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC3a_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC3b_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC4a_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC4b_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC5a_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC5b_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC6a_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC6b_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC7a_Aligned.sortedByCoord.out.bam_R1.fq
mapped_trimmed.LLC7b_Aligned.sortedByCoord.out.bam_R1.fq

This is exactly how the files are named in my folder.

I hope someone can help me!


You seem to be pointing to a salmon transcriptome index instead of a STAR genome index.


I corrected that; I can see how that would be confusing. Thanks, it really is a STAR index.


for i in *R1.fq; do name=$(basename ${i} _R1.fq);

Use a workflow manager like Nextflow or Snakemake.

anscripts/Genome \ #Path to the index genera

I'm not sure bash likes having anything after a \

3.3 years ago
e \ #Path to the index generated previously

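Remove anything after a \ , even if it's only a space or a comment. The backslash continues the command onto the next line only when it is the very last character on its line.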
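
For illustration, here is a minimal sketch of the loop with that fix applied. The flags and paths are copied from the question; the align_sample function name, the quoting, the existence check, and the slightly stricter *_R1.fq glob are my own additions, and the SGE directives and module load line from the original script are left out for brevity. The likely reason the error points at the "(": with "\ #Path ..." the backslash only escapes the following space, so the # no longer starts a comment and the command ends at that newline. Bash then tries to run each following line as a separate command and trips over the parenthesis in "(forward and reverse)".

```shell
#!/bin/bash

# Align one paired-end sample. Flags are copied from the question:
#   --genomeDir          path to the index generated previously
#   --runThreadN         number of cores
#   --readFilesIn        input files (forward and reverse)
#   --outFileNamePrefix  prefix for the output files
# align_sample is a hypothetical helper name, not part of STAR.
align_sample() {
    # Every backslash below is the very last character on its line.
    STAR --genomeDir /scratch/STAR_index \
        --runThreadN 12 \
        --readFilesIn "${1}_R1.fq" "${1}_R2.fq" \
        --outFileNamePrefix "${1}_aligned_transcriptome" \
        --outSAMtype BAM SortedByCoordinate \
        --limitBAMsortRAM 31000000000
}

for r1 in *_R1.fq; do
    # If the glob matched nothing, the literal pattern is left in $r1: skip it.
    [ -e "$r1" ] || continue
    align_sample "$(basename "$r1" _R1.fq)"
done
```

Keeping the comments above the command rather than inline means there is never anything to the right of a continuation backslash. Running cat -A on the script (GNU cat) will show a stray trailing space as "\ $" instead of "\$".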
