STAR error with a loop
18 months ago
luzglongoria ▴ 50

Hi there,

I'm running STAR by using a loop:

time for file in *.fp.fq.gz; do echo ${file%%1.fp.fq.gz}2.fp.fq.gz; STAR --runThreadN 30 --genomeDir /home/path/parasite_index/ --readFilesIn $file ${file%%1.fp.fq.gz}2.fp.fq.gz --readFilesCommand zcat; done

But I get this error:

Dec 01 20:35:37 ..... started STAR run
Dec 01 20:35:37 ..... loading genome
Dec 01 20:35:40 ..... started mapping
Dec 02 20:29:31 ..... finished mapping
Dec 02 20:29:31 ..... finished successfully

EXITING: because of fatal INPUT file error: could not open read file: R13_2.fp.fq.gz2.fp.fq.gz
SOLUTION: check that this file exists and has read permission.

I am running it in the folder where the samples are. The program keeps running even after the error (it jumps to the next sample). After a while I get the same error for another sample. Not all the samples give an error: apparently it happens only for some of the "_2" (reverse) files, not for the "_1" (forward) ones.

Here is the sample in question. It is in the folder where I'm running the loop.

ls  R13_*
R13_1.fp.fq.gz  R13_2.fp.fq.gz

Any idea of what's going on? Thank you in advance.

STAR RNA transcriptomics
I'm running STAR by using a loop:

You should use a workflow manager such as Snakemake or Nextflow.

The program continues running even with the error (it jumps to the next sample).

See the bash options -e and -o pipefail (set -e, set -o pipefail).
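As a minimal sketch of what those two options change, the snippet below runs three tiny bash sessions and records what each one reports. The variable names are only illustrative. Note that in the original one-liner STAR is not part of a pipeline, so it is mainly -e (or an explicit check of STAR's exit status) that would stop the loop at the first failing sample.

```shell
# Without pipefail, a pipeline reports only the exit status of its last command.
plain_status=$(bash -c 'false | true; echo $?')

# With pipefail, a failure anywhere in the pipeline becomes the pipeline's status.
pipefail_status=$(bash -c 'set -o pipefail; false | true; echo $?')

# With -e, the shell exits at the first failing command, so "after" is never printed.
errexit_output=$(bash -c 'set -e; false; echo after' || true)

echo "plain=$plain_status pipefail=$pipefail_status errexit_output='$errexit_output'"
```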

Apparently only some of the ones that are "_2" (reverse) not the "_1" (forward).

how about something like

ls *.fp.fq.gz | sed 's/_[12].fp.fq.gz//' | sort | uniq | while read F; do echo "run with ${F}_1.fq.gz and ${F}_2.fq.gz " ; done
Thanks Pierre but this loop does not include STAR. What's its point?

What's its point?


Look at what happens with a simple echo.

Once the output looks right, replace echo with the actual STAR command.
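A sketch of what that replacement might look like, reusing the STAR options from the question. It assumes the files are named <sample>_1.fp.fq.gz and <sample>_2.fp.fq.gz, as in the ls output above, so the ".fp" part is kept when the two names are rebuilt. sample_prefixes is a hypothetical helper name, and the leading echo keeps this a dry run.

```shell
# Print the unique sample prefixes, e.g. R13 for R13_1.fp.fq.gz and R13_2.fp.fq.gz.
# The dots are escaped so that sed matches them literally.
# Assumes file names without whitespace, as in the thread.
sample_prefixes() {
    ls *.fp.fq.gz 2>/dev/null | sed 's/_[12]\.fp\.fq\.gz$//' | sort -u
}

sample_prefixes | while read -r F; do
    # Dry run: remove the leading "echo" once the printed commands look right.
    echo STAR --runThreadN 30 \
        --genomeDir /home/path/parasite_index/ \
        --readFilesIn "${F}_1.fp.fq.gz" "${F}_2.fp.fq.gz" \
        --readFilesCommand zcat
done
```

Each sample then appears exactly once, with its forward file first and its reverse file second.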

If I do

ls *.fp.fq.gz | sed 's/_[12].fp.fq.gz//' | sort | uniq | while read F; do echo "run with ${F}_1.fq.gz and ${F}_2.fq.gz " ; done

Then I get

run with R1_1.fq.gz and R1_2.fq.gz 
run with R10_1.fq.gz and R10_2.fq.gz 
run with R11_1.fq.gz and R11_2.fq.gz 
run with R12_1.fq.gz and R12_2.fq.gz 
run with R13_1.fq.gz and R13_2.fq.gz 
run with R14_1.fq.gz and R14_2.fq.gz 
run with R15_1.fq.gz and R15_2.fq.gz 
run with R16_1.fq.gz and R16_2.fq.gz 
run with R17_1.fq.gz and R17_2.fq.gz 
run with R18_1.fq.gz and R18_2.fq.gz 
run with R19_1.fq.gz and R19_2.fq.gz 
run with R2_1.fq.gz and R2_2.fq.gz 
run with R20_1.fq.gz and R20_2.fq.gz 
run with R21_1.fq.gz and R21_2.fq.gz 
run with R22_1.fq.gz and R22_2.fq.gz 
run with R23_1.fq.gz and R23_2.fq.gz 
run with R24_1.fq.gz and R24_2.fq.gz 
run with R25_1.fq.gz and R25_2.fq.gz 
run with R26_1.fq.gz and R26_2.fq.gz 
run with R27_1.fq.gz and R27_2.fq.gz 
run with R28_1.fq.gz and R28_2.fq.gz 
run with R29_1.fq.gz and R29_2.fq.gz 
run with R3_1.fq.gz and R3_2.fq.gz 
run with R30_1.fq.gz and R30_2.fq.gz 
run with R31_1.fq.gz and R31_2.fq.gz 
run with R32_1.fq.gz and R32_2.fq.gz 
run with R33_1.fq.gz and R33_2.fq.gz 
run with R34_1.fq.gz and R34_2.fq.gz 
run with R35_1.fq.gz and R35_2.fq.gz 
run with R36_1.fq.gz and R36_2.fq.gz 
run with R37_1.fq.gz and R37_2.fq.gz 
run with R38_1.fq.gz and R38_2.fq.gz 
run with R39_1.fq.gz and R39_2.fq.gz 
run with R4_1.fq.gz and R4_2.fq.gz 
run with R40_1.fq.gz and R40_2.fq.gz 
run with R41_1.fq.gz and R41_2.fq.gz 
run with R42_1.fq.gz and R42_2.fq.gz 
run with R5_1.fq.gz and R5_2.fq.gz 
run with R6_1.fq.gz and R6_2.fq.gz 
run with R7_1.fq.gz and R7_2.fq.gz 
run with R8_1.fq.gz and R8_2.fq.gz 
run with R9_1.fq.gz and R9_2.fq.gz 

It seems that all the samples are matched with their pairs, or am I missing something?

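The point is that your loop currently builds file names such as R13_2.fp.fq.gz2.fp.fq.gz because the file name is not parsed properly. The glob *.fp.fq.gz also matches the _2 files, and for those ${file%%1.fp.fq.gz} removes nothing, so 2.fp.fq.gz is appended to the full name. That code suggestion tries to address that.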
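To make this concrete, here is a small sketch that first reproduces the bad name and then loops over the forward files only. mate2_for is a hypothetical helper name, and --outFileNamePrefix is an addition that was not in the original command: without it, STAR writes every run to the same default output names in the current directory, so a per-sample prefix is probably wanted. The echo keeps it a dry run.

```shell
# The original expansion strips "1.fp.fq.gz" only when the name ends with it.
file=R13_1.fp.fq.gz
echo "${file%%1.fp.fq.gz}2.fp.fq.gz"   # R13_2.fp.fq.gz (intended)
file=R13_2.fp.fq.gz
echo "${file%%1.fp.fq.gz}2.fp.fq.gz"   # R13_2.fp.fq.gz2.fp.fq.gz (the name in the error)

# Derive the reverse-read file name from a forward-read file name.
# Assumes the <sample>_1.fp.fq.gz / <sample>_2.fp.fq.gz naming shown in the thread.
mate2_for() {
    printf '%s\n' "${1%_1.fp.fq.gz}_2.fp.fq.gz"
}

# Loop over forward reads only, so a _2 file is never used as the first mate.
for r1 in *_1.fp.fq.gz; do
    [ -e "$r1" ] || continue            # the glob matched nothing
    r2=$(mate2_for "$r1")
    if [ ! -r "$r2" ]; then
        echo "skipping $r1: mate $r2 not found or not readable" >&2
        continue
    fi
    sample=${r1%_1.fp.fq.gz}
    # Dry run: remove the leading "echo" once the printed commands look right.
    echo STAR --runThreadN 30 \
        --genomeDir /home/path/parasite_index/ \
        --readFilesIn "$r1" "$r2" \
        --readFilesCommand zcat \
        --outFileNamePrefix "${sample}_"
done
```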


