STAR not working in a for loop
5 weeks ago

Hi,

I'm trying to run STAR on multiple samples. I used the following code:

lis=$(ls *R1_001_out.fastq)
for file in $lis
do base=${file%R1*}
echo $base
read1=/home/vinisha/globus/Nadine/Tgen_choroid_samples/transcriptomic_analysis/unzip/${base}R1_001_out.fastq
read2=/home/vinisha/globus/Nadine/Tgen_choroid_samples/transcriptomic_analysis/unzip/${base}R2_001_out.fastq
/home/vinisha/STAR-2.5.2b/source/STAR-2.5.2b --genomeDir /home/vinisha/STAR-2.5.2b/source \
--readFilesIn $read1 $read2 \
--runThreadN 9 \
--sjdbGTFfile /home/vinisha/STAR-2.5.2b/source/Homo_sapiens.GRCh37.75_new.gtf \
--outFileNamePrefix /home/vinisha/globus/TGEN_STAR/${base}/out_ \
--runMode alignReads \
--outSAMtype BAM Unsorted \
--outSAMmode Full \
--outSAMstrandField intronMotif \
--outFilterType BySJout \
--outSAMunmapped Within \
--outSAMmapqUnique 255 \
--outFilterMultimapNmax 20 \
--outFilterMismatchNmax 999 \
--outFilterMismatchNoverLmax 0.1 \
--alignMatesGapMax 1000000 \
--seedSearchStartLmax 50 \
--alignIntronMin 20 \
--alignIntronMax 1000000 \
--alignSJoverhangMin 18 \
--alignSJDBoverhangMin 18 \
--chimSegmentMin 18 \
--chimJunctionOverhangMin 18 \
--outSJfilterOverhangMin 18 18 18 18 \
--alignTranscriptsPerReadNmax 50000 \
--limitBAMsortRAM 31000000000
done;

It runs for the first sample; the second one starts but then gets killed. Has anyone experienced the same thing?

Thanks,
Vinisha

for RNASeq STAR • 365 views

What's the error message?


I don't get any error. The log file is clean but it stops right after thread 8.

Last few lines of log file:

Created thread # 1
Created thread # 2
Created thread # 3
Created thread # 4
Created thread # 5
Created thread # 6
Created thread # 7
Created thread # 8

Put echo before /home/vinisha/STAR-2.5.2b/source/STAR-2.5.2b so the loop prints each command instead of running it; that lets you debug and see what's happening. Also, consider using a workflow manager.
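A minimal sketch of the echo approach, in case it helps. STAR_BIN, GENOME_DIR, OUT_ROOT and the build_cmd helper are placeholder names for illustration (not from the original script), and only a few of the STAR options are shown:

```shell
#!/usr/bin/env bash
# Dry run: print the STAR command for each sample instead of executing it.
# STAR_BIN, GENOME_DIR and OUT_ROOT are placeholder values (assumptions).
STAR_BIN=STAR
GENOME_DIR=/path/to/genome
OUT_ROOT=/path/to/output

# Build (and print) the command line for one R1 file.
build_cmd() {
    local read1="$1"
    local base="${read1%R1*}"              # strip everything from "R1" onwards
    local read2="${base}R2_001_out.fastq"  # matching mate file
    echo "$STAR_BIN" --genomeDir "$GENOME_DIR" \
        --readFilesIn "$read1" "$read2" \
        --runThreadN 9 \
        --outFileNamePrefix "$OUT_ROOT/$base/out_"
}

for file in *R1_001_out.fastq; do
    [ -e "$file" ] || continue   # skip if the glob matched nothing
    build_cmd "$file"
done
```

Once the printed commands look right, drop the echo inside build_cmd to actually run them.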


I'm running the code from a script. I added a log file, and it looks like this:

ALS93D_S15_
Jun 16 09:52:15 ..... started STAR run
Jun 16 09:52:15 ..... loading genome
Jun 16 09:52:50 ..... processing annotations GTF
Jun 16 09:53:26 ..... inserting junctions into the genome indices
Jun 16 09:56:00 ..... started mapping
Jun 16 13:30:42 ..... finished successfully
ALS93E_S9_
Jun 16 13:30:43 ..... started STAR run
Jun 16 13:30:43 ..... loading genome
Jun 16 13:31:18 ..... processing annotations GTF
Jun 16 13:31:48 ..... inserting junctions into the genome indices
Jun 16 13:34:21 ..... started mapping

It stops here.


Print all variables before you run the command. Also, the last command between do and done has to be terminated (by a newline or a ;) before done. I don't see a ; there in your post. Is that a copy/paste problem?
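For what it's worth, here is a rough sketch of printing the variables and checking that both FASTQ files are readable before STAR is launched. check_pair is a hypothetical helper name, and the R2 file name is assumed to follow the same pattern as in the question:

```shell
#!/usr/bin/env bash
# Print the per-sample variables and verify both FASTQ files are readable.
# check_pair is a hypothetical helper, not part of STAR or the original script.
check_pair() {
    local base="$1" read1="$2" read2="$3"
    echo "base=$base"
    echo "read1=$read1"
    echo "read2=$read2"
    for f in "$read1" "$read2"; do
        if [ ! -r "$f" ]; then
            echo "missing or unreadable: $f" >&2
            return 1
        fi
    done
}

# As far as I know, a newline after the last command is enough before
# 'done'; an explicit ';' is only needed when everything is on one line.
for file in *R1_001_out.fastq; do
    [ -e "$file" ] || continue   # skip if the glob matched nothing
    base=${file%R1*}
    check_pair "$base" "$file" "${base}R2_001_out.fastq" || echo "skipping $base"
done
```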


The variables are good; I checked them. The thing is, sometimes it completes two iterations and then stops during the third, but most of the time it stops in the second iteration. I run it on a CentOS server with 32 GB of RAM, which I connect to via PuTTY. Is there any other reason it could get terminated?


Probably your SSH session is timing out. Try using screen or tmux to keep the process running in the background; even if you get disconnected from the server, it will keep running. Also check the system resources you have allocated to this process, e.g. threads (9 out of how many?) and RAM (31 GB for BAM I/O on a 32 GB machine). Sometimes it can also be a heating problem; in that case, try delaying each loop iteration by 30 s or 1 min.
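Since the STAR log shows no error, the exit status of each run may tell you more. Below is a small sketch under a few assumptions: REQ_THREADS mirrors the --runThreadN 9 from the question, threads_ok and describe_status are hypothetical helper names, and the mapping relies on the shell convention that a process killed by signal N exits with status 128 + N.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check the thread count and interpret a run's exit status.
# REQ_THREADS mirrors --runThreadN 9; the helper names are illustrative.
REQ_THREADS=9

# Return 0 if the requested thread count ($1) fits in the online cores ($2).
threads_ok() {
    [ "$1" -le "$2" ]
}

# Describe an exit status. Shells report 128 + N for death by signal N,
# so 137 means SIGKILL (9) and 129 means SIGHUP (1).
describe_status() {
    case "$1" in
        0)   echo "finished successfully" ;;
        137) echo "killed by SIGKILL (possibly the kernel OOM killer)" ;;
        129) echo "killed by SIGHUP (terminal/SSH session closed)" ;;
        *)   echo "exited with status $1" ;;
    esac
}

cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
if ! threads_ok "$REQ_THREADS" "$cores"; then
    echo "warning: $REQ_THREADS threads requested, $cores cores online"
fi
```

In the loop, calling describe_status $? right after the STAR command would record why each sample ended. To survive a dropped connection, the whole script can be started inside a tmux session (tmux new -s star) or with nohup bash yourscript.sh > run.log 2>&1 &, where yourscript.sh stands for whatever your script is called.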
