How can I insert a "remove duplicates" command in a samtools script?
1
0
Entering edit mode
24 months ago
José ▴ 10

I have the following script and I need to add a command to remove pcr duplicates from samples. I think I need to add the command samtools markdup -r ... after the sorting command and before the indexing command, but I don't really know how I should right the input and output in the command. It's a little bit confusing to me. Would really appreciate if someone could help.

Script:

for sample in *.sam
do
    echo $sample
    var=$(echo ${sample} | sed 's/.sam//')
    echo $var

    # Convert file from SAM to BAM format
    samtools view -Sb $sample > ${var}.uns.bam

    # Sort BAM file
    samtools sort -T /tmp/$var.sorted -o $var.bam $var.uns.bam

    # index the bam file
    samtools index ${var}.bam

    # Remove intermediate files
    rm ${var}.uns.bam
 done
samtools markdup • 830 views
ADD COMMENT
3
Entering edit mode
24 months ago
cmdcolin ★ 3.8k

I found this script from the htslib documentation is nice about using markdup http://www.htslib.org/workflow/fastq.html if you wanted to go all the way from raw reads to sorted+indexed cram file

I have this script saved as easy_align_shortreads.sh

#!/bin/bash
minimap2 -t 8 -a -x sr "$1" "$2" "$3"  | \
samtools fixmate -u -m - - | \
samtools sort -u -@2 - | \
samtools markdup -@8 --reference "$1" - --write-index "$4"

And call as ./easy_align_shortreads.sh ref.fa 1.fq 2.fq out.cram, creates out.cram and out.cram.crai using minimap2 as aligner. minimap2 can be replaced with another aligner if needed

If you want to keep your script as is, then you can add something like

samtools markdup -@8 --reference ref.fa $var.bam $var.markdup.bam

and add the -r to remove duplicates

samtools markdup -r -@8 --reference ref.fa $var.bam $var.markdup.bam
ADD COMMENT
1
Entering edit mode

note also: to use markdup: "The input file must be coordinate sorted and must have gone through fixmates with the mate scoring option on." you don't mention having run fixmates before so make sure that is done

ADD REPLY
1
Entering edit mode

Thank you very much, that was really helpful and you are absolutely right, i totally forgot to run fixmates.

ADD REPLY

Login before adding your answer.

Traffic: 1803 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6