Multiple file genome alignment with STAR
2.6 years ago

Hello, I have a question about STAR alignment with multiple files. I am running it on a server where my memory is limited (unfortunately I have some restrictions). I have 12 files; can I align them in two groups of 6, the first 6 in one run and the other 6 in a second run? Would that cause a problem for downstream steps such as quantification and differential gene expression analysis?

#!/bin/bash
#SBATCH -p hamsi
#SBATCH -A proj2
#SBATCH -c 28
#SBATCH -N 1
#SBATCH -t 0-4:00
#SBATCH -J star_alignment
#SBATCH -o star_alignment_%j.out
#SBATCH -e star_alignment_%j.err

cd ~/GSE121634_HCC4006

for file in $(cat ./acc_number.txt); 
do
~/STAR-2.7.10a/bin/Linux_x86_64_static/./STAR --runThreadN 28 \
--genomeDir ~/index/ \
--readFilesIn ./${file}_1.fastq.gz ./${file}_2.fastq.gz \
--readFilesCommand zcat \
--outFileNamePrefix ~/GSE121634_HCC4006/${file} \
--outSAMtype BAM SortedByCoordinate
done

This gives me a memory error, so I then tried without the for loop:

#!/bin/bash
#SBATCH -p hamsi
#SBATCH -A proj2
#SBATCH -c 28
#SBATCH -N 1
#SBATCH -t 0-4:00
#SBATCH -J star_alignment
#SBATCH -o star_alignment_%j.out
#SBATCH -e star_alignment_%j.err

cd ~/GSE121634_HCC4006/Alignment

~/tools/STAR-2.7.10a/bin/Linux_x86_64_static/./STAR --runThreadN 28 \
--genomeDir ~/index/ \
--readFilesIn SRR8088215_1.fastq.gz,SRR8088216_1.fastq.gz,SRR8088217_1.fastq.gz,SRR8088218_1.fastq.gz,SRR8088219_1.fastq.gz,SRR8088220_1.fastq.gz,SRR8088221_1.fastq.gz,SRR8088222_1.fastq.gz,SRR8088223_1.fastq.gz,SRR8088224_1.fastq.gz,SRR8088225_1.fastq.gz,SRR8088226_1.fastq.gz,SRR8088227_1.fastq.gz,SRR8088228_1.fastq.gz,SRR8088229_1.fastq.gz,SRR8088230_1.fastq.gz,SRR8088231_1.fastq.gz,SRR8088232_1.fastq.gz SRR8088215_2.fastq.gz,SRR8088216_2.fastq.gz,SRR8088217_2.fastq.gz,SRR8088218_2.fastq.gz,SRR8088219_2.fastq.gz,SRR8088220_2.fastq.gz,SRR8088221_2.fastq.gz,SRR8088222_2.fastq.gz,SRR8088223_2.fastq.gz,SRR8088224_2.fastq.gz,SRR8088225_2.fastq.gz,SRR8088226_2.fastq.gz,SRR8088227_2.fastq.gz,SRR8088228_2.fastq.gz,SRR8088229_2.fastq.gz,SRR8088230_2.fastq.gz,SRR8088231_2.fastq.gz,SRR8088232_2.fastq.gz \
--readFilesCommand zcat \
--outSAMtype BAM SortedByCoordinate

So, will running them separately cause any problems? Thanks in advance.

alignment rna-seq star • 875 views

Why do you need to run all the samples together? Align the paired reads for each sample separately, one sample at a time.

If required, you can merge the BAM files after alignment.
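
Aligning one sample at a time could be sketched roughly as follows. This is a minimal sketch that reuses the index path and FASTQ naming from the question. The `run` helper, the `align_sample` and `align_all` functions, and the `DRY_RUN` variable are hypothetical names introduced here for illustration, and STAR is assumed to be on the `PATH`. Because each sample is aligned independently against the same index, splitting the accession list into two batches of 6 should not change the per-sample results used for quantification.

```shell
#!/bin/bash
# Sketch: align each paired-end sample separately with STAR.
# Assumptions: STAR is on PATH, the index lives in ~/index/, and the
# FASTQ files are named <accession>_1.fastq.gz / <accession>_2.fastq.gz.

# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical function: align one sample given its accession.
align_sample() {
    local acc="$1"
    run STAR --runThreadN 28 \
        --genomeDir "$HOME/index/" \
        --readFilesIn "${acc}_1.fastq.gz" "${acc}_2.fastq.gz" \
        --readFilesCommand zcat \
        --outFileNamePrefix "${acc}_" \
        --outSAMtype BAM SortedByCoordinate
}

# Hypothetical function: align every accession in a list file (one per line).
# The list is read on file descriptor 3 so the aligner cannot consume it.
align_all() {
    local list="$1" acc
    while IFS= read -r acc <&3; do
        if [ -n "$acc" ]; then
            align_sample "$acc"
        fi
    done 3< "$list"
}
```

Calling `DRY_RUN=1 align_all acc_number.txt` prints the commands without running them, which is a cheap way to check the file names first. For two batches, the accession list can simply be split into two files and `align_all` called once per job.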

You could try writing an unsorted BAM (`--outSAMtype BAM Unsorted`) and then sorting it with samtools.
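
The unsorted-then-sort approach might look like the sketch below. It assumes STAR and samtools are both on the `PATH`. The `run` helper, the `align_then_sort` function, the `DRY_RUN` variable and the `_sorted.bam` output name are hypothetical, and the thread and memory values passed to `samtools sort` are placeholders to adapt to the server's limits.

```shell
#!/bin/bash
# Sketch: let STAR write an unsorted BAM, then sort it with samtools.
# Assumptions: STAR and samtools are on PATH; the -@ and -m values are
# placeholders, not tuned recommendations.

# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical function: align one sample unsorted, then coordinate-sort.
align_then_sort() {
    local acc="$1"
    run STAR --runThreadN 28 \
        --genomeDir "$HOME/index/" \
        --readFilesIn "${acc}_1.fastq.gz" "${acc}_2.fastq.gz" \
        --readFilesCommand zcat \
        --outFileNamePrefix "${acc}_" \
        --outSAMtype BAM Unsorted
    # STAR names the unsorted output <prefix>Aligned.out.bam.
    # samtools sort: -@ sets extra threads, -m sets memory per thread.
    run samtools sort -@ 8 -m 2G \
        -o "${acc}_sorted.bam" "${acc}_Aligned.out.bam"
}
```

Running `DRY_RUN=1 align_then_sort SRR8088215` prints both commands for inspection. Since `-m` applies per thread, the memory used by sorting grows with the thread count, so both values may need lowering on a restricted node.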
