Super basic question, probably more related to programming than RNA-seq itself...
I have multiple FASTQs that I want to align together (the same sample sequenced on 3 separate lanes to provide more depth). According to the STAR manual, I should be able to list all the read 1 files separated by commas, then a space, then all the read 2 files separated by commas. Like this: L1_R1,L2_R1,L3_R1 L1_R2,L2_R2,L3_R2
I keep getting the error: gzip: /Sample_1/fastq/*L003_001.R2.fastq.gz: No such file or directory
When I bring it out to the terminal to check, this is what I see:
echo $FASTQ1 $FASTQ2
/Sample_1/fastq/Sample_1_TTAGGC_BCCRUJANXX_L003_001.R1.fastq.gz /Sample_1/fastq/Sample_1_TTAGGC_BCCRUJANXX_L003_001.R2.fastq.gz

echo $FASTQ1,$FASTQ2
/Sample_1/fastq/*L003_001.R1.fastq.gz,/Sample_1/fastq/*L003_001.R2.fastq.gz
So clearly it recognizes the variable name, but can't deal with the comma in between. Is there a way around this or do I need to write out the full file names in the STAR script (which would be pretty annoying, since I have many files like this)?
Thanks! Here's the code for reference (I've modified it slightly since the whole path names are long):
IN=/Sample_1
genomeDir=/STAR
GTF=/gencode.v19.chr_patch_hapl_scaff.annotation.gtf
FASTQ1=$IN/fastq/*L003_001.R1.fastq.gz
FASTQ2=$IN/fastq/*L003_001.R2.fastq.gz
FASTQ3=$IN/fastq/*L004_001.R1.fastq.gz
FASTQ4=$IN/fastq/*L004_001.R2.fastq.gz
FASTQ5=$IN/fastq/*L005_001.R1.fastq.gz
FASTQ6=$IN/fastq/*L005_001.R2.fastq.gz

STAR --runThreadN 8 \
  --genomeDir $genomeDir \
  --sjdbGTFfile $GTF \
  --sjdbOverhang 149 \
  --bamRemoveDuplicatesType UniqueIdentical \
  --readFilesIn $FASTQ1,$FASTQ3,$FASTQ5 $FASTQ2,$FASTQ4,$FASTQ6 \
  --twopassMode Basic \
  --outSAMtype BAM SortedByCoordinate Unsorted \
  --quantMode TranscriptomeSAM GeneCounts \
  --readFilesCommand zcat
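
A note on what seems to be happening, plus one possible workaround. The assignments store the literal pattern (the shell does not expand `*` in an assignment), and the pattern is only expanded later, when the unquoted variable is used as a whole word. In `$FASTQ1,$FASTQ2` the whole word is `pattern1,pattern2`, which matches no single file, so it is passed through unchanged. The sketch below resolves each pattern on its own first and only then joins the results with commas. `join_globs` is a hypothetical helper name, not something from STAR, and it assumes the paths contain no spaces.

```shell
# Hypothetical helper: expand each glob pattern separately, then join all
# matching files with commas. Assumes paths contain no whitespace.
join_globs() {
    joined=""
    for pattern in "$@"; do
        # $pattern is deliberately unquoted so the shell expands the glob here
        for f in $pattern; do
            if [ ! -e "$f" ]; then
                echo "no file matches: $pattern" >&2
                return 1
            fi
            # append with a comma separator, except before the first file
            joined="${joined:+$joined,}$f"
        done
    done
    printf '%s\n' "$joined"
}

# Possible usage with the variables from the script above (patterns are quoted
# so they reach the helper unexpanded):
#   READ1=$(join_globs "$IN/fastq/*L003_001.R1.fastq.gz" "$IN/fastq/*L004_001.R1.fastq.gz" "$IN/fastq/*L005_001.R1.fastq.gz")
#   READ2=$(join_globs "$IN/fastq/*L003_001.R2.fastq.gz" "$IN/fastq/*L004_001.R2.fastq.gz" "$IN/fastq/*L005_001.R2.fastq.gz")
#   STAR ... --readFilesIn "$READ1" "$READ2" ...
```

Failing loudly when a pattern matches nothing is a deliberate choice here, so that a missing lane is caught before STAR starts rather than showing up as a gzip error.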