Ending up with only one BAM file
1
0
Entering edit mode
6 weeks ago
salman_96 ▴ 60

Hi,

I am working on a variant calling pipeline. Currently, I am trying to generate BAM files from raw fastq files.

The files are paired end end seq files and named like this (there are more than 20 files)

• AB03_T1_R1_001_trimmed.fastq AB03_T1_R2_001_trimmed.fastq
• AB05_T1_2D_R1_001_trimmed.fastq AB05_T1_2D_R2_001_trimmed.fastq
• AB09T1_2D_R1_001_trimmed.fastq AB09T1_2D_R2_001_trimmed.fastq

The code is :

cd /data/Data/Results

genome=/data/Data/Genome/GRCh38.primary_assembly.genome.fa

for fq1 in /data/Data/*_R1_001_trimmed.fastq

do
echo "working with file $fq1" base=$(basename $fq1 _R1_001_trimmed.fastq) echo "basename is$base"

fq1=/data/Data/${base}_R1_001_trimmed.fastq fq2=/data/Data/${base}_R2_001_trimmed.fastq
--runThreadN 30 --readFilesIn ${fq1}${fq2} --outSAMtype BAM SortedByCoordinate --outFileNamePrefix STARpaired

done


The resulting file I am getting is only one .bam file named STARpairedAligned.sortedByCoord.out.bam

Can anyone assist why do I get only one file and how to fix this problem as I need individual files for each sample as I want to run Variant call on individual samples?

Regards, Salman

bam calling star variant files • 279 views
0
Entering edit mode
for fq1 in /data/Data/*_R1_001_trimmed.fastq


you should use a workflow manager like nextflow or snakemake

0
Entering edit mode

You should get used right away to never work with uncompressed files. It just doesn't scale and takes up disk space which can be limited on shared clusters with quotas. Gzip your fastq files, most trimmers offer that. Same with the BAM files from STAR, use --outBAMcompression 5 for the STAR command to get BAM compression, otherwise the BAM files take up just hilariously much space. STAR is so bad at providing good defaults for simple things like that, but it is great at mapping and that is what counts most.

2
Entering edit mode
6 weeks ago

you're always using the very same output file whatever is the input in the loop.

--outFileNamePrefix STARpaired


use something like

 --outFileNamePrefix "\${base}STARpaired"