Creating a loop to mark duplicates using Picard
13 months ago

Hi there,

I am trying to write a loop that runs multiple sorted BAM files through Picard to mark duplicates. My bash script is not working, and I'm at a loss.

#! /bin/bash

picard=/home/picard/picard.jar
sbam=/data/B_bam

for bamfile in *_sorted.bam
do
    java -jar "$picard" MarkDuplicates INPUT="$sbam"/"$bamfile" OUTPUT="$sbam"/"${bamfile%_sorted.bam}"_md.bam METRICS_FILE="$sbam"/"${sbam%_sorted.bam}"_metrix
    samtools index "$sbam"/"${bamfile%_sorted.bam}"_md.bam
done

Error Message:

Cannot read non-existent file: /data/B_bam/*_sorted.bam
bash picard • 1.1k views
13 months ago
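In the METRICS_FILE argument, use

METRICS_FILE="$sbam"/"${bamfile%_sorted.bam}"_metrix

instead of

METRICS_FILE="$sbam"/"${sbam%_sorted.bam}"_metrix

${sbam%_sorted.bam} strips the suffix from the directory variable, not from the file name. sbam does not end in _sorted.bam, so it expands unchanged and every sample gets the same bogus metrics path, /data/B_bam//data/B_bam_metrix.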

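Also check that there is a BAM file in the current working directory: for bamfile in *_sorted.bam is expanded relative to the directory you run the script from, and when nothing matches, bash passes the pattern through literally, which is why the error mentions /data/B_bam/*_sorted.bam. Maybe you wanted for bamfile in /data/B_bam/*_sorted.bam. In that case bamfile already holds the full path, so drop the "$sbam"/ prefix from INPUT and strip the directory before building the output names.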
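The two fixes above can be sketched together roughly as follows. This is only one way to arrange it, not a definitive script: the run helper, the DRY_RUN switch and the mark_duplicates_in function are hypothetical names added for illustration so the commands can be printed and checked before anything is executed. The paths and Picard arguments are taken from the original post.

```shell
#!/bin/bash

picard=/home/picard/picard.jar   # path from the original post
sbam=/data/B_bam                 # directory holding the *_sorted.bam files

# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper: mark duplicates for every *_sorted.bam in one directory.
mark_duplicates_in() {
    local dir=$1 bamfile sample
    # Glob inside the target directory, not the current working directory.
    for bamfile in "$dir"/*_sorted.bam
    do
        # If nothing matched, bash leaves the pattern as-is; skip it.
        [ -e "$bamfile" ] || continue
        sample=${bamfile##*/}          # strip the directory part
        sample=${sample%_sorted.bam}   # strip the suffix
        run java -jar "$picard" MarkDuplicates \
            INPUT="$bamfile" \
            OUTPUT="$dir/${sample}_md.bam" \
            METRICS_FILE="$dir/${sample}_metrix"
        run samtools index "$dir/${sample}_md.bam"
    done
}

# Print the commands first; drop DRY_RUN=1 once they look right.
DRY_RUN=1 mark_duplicates_in "$sbam"
```

With DRY_RUN=1 nothing is executed, so the derived file names can be inspected before Picard is started.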

Also, don't use a loop in a shell script when you can use Snakemake or Nextflow.


don't use a loop in a shell script when you can use Snakemake or Nextflow

People need to start somewhere. Loops -> xargs/parallel -> snakemake/nextflow is the logical progression in learning automation IMO.
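As a rough illustration of the middle step in that progression, the same per-file job can be handed to xargs instead of a for loop. This is a sketch under assumptions: list_jobs is a hypothetical name, the job count of 2 is arbitrary, and the inner command only echoes what it would do, so the real Picard call would still have to be substituted in.

```shell
#!/bin/bash

sbam=/data/B_bam   # directory from the original post

# Hypothetical helper: one job per *_sorted.bam, at most two at a time.
list_jobs() {
    local dir=$1
    find "$dir" -maxdepth 1 -name '*_sorted.bam' -print0 |
        xargs -0 -n 1 -P 2 sh -c '
            # Some xargs versions run the command once even with no input.
            [ -n "$1" ] || exit 0
            bam=$1
            out=${bam%_sorted.bam}_md.bam
            echo "would process $bam -> $out"
        ' sh
}

if [ -d "$sbam" ]; then
    list_jobs "$sbam"
fi
```

Because the jobs run concurrently, the lines may come out in any order.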

I find it weird that OP uses a mix of ${} and "..."/"...." (multiple quoted strings interleaved with multiple unquoted strings). Why not simply "${sbam}/${bamfile%_sorted.bam}_metrix"?
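For what it's worth, the two spellings expand to the same string, so this is a readability point only. A quick check, where sample1_sorted.bam is a made-up file name used just for the demonstration:

```shell
#!/bin/bash

sbam=/data/B_bam
bamfile=sample1_sorted.bam   # hypothetical file name for illustration

# OP's style: several quoted and unquoted pieces glued together.
interleaved="$sbam"/"${bamfile%_sorted.bam}"_metrix

# Single quoted string with braces around each variable.
single="${sbam}/${bamfile%_sorted.bam}_metrix"

echo "$interleaved"
echo "$single"
```

Both lines print /data/B_bam/sample1_metrix. The braces matter in the single-string form: without them, "$sbam_metrix" would be read as a variable named sbam_metrix.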


Sorry, I am very new at this and am just learning my way around these symbols! Thanks for the help :)


For someone new at shell scripts, you're doing really well at using shell variables and parameter expansion. Good going!
