Remove flags of MarkDuplicates (picard)
9.5 years ago
Coryza ▴ 430

Hi,

Is it possible to remove the MarkDuplicates flags (not the sequences) from a BAM file? If so, how?

duplicates flags picard samtools • 6.3k views
9.5 years ago

You can use Picard RevertSam for this. This tool can be used to reset various attributes of a BAM file including duplicate information. Simply use: REMOVE_DUPLICATE_INFORMATION=true

Example command:

java -Xmx7g -jar ~/tools/picard/picard-tools-1.118/RevertSam.jar OUTPUT=UnmarkedDuplicates.bam INPUT=MarkedDuplicates.bam REMOVE_DUPLICATE_INFORMATION=true

RevertSam is new to me. Thanks

9.5 years ago

Depending on which version of awk you have, something like the following should work:

samtools view -h foo.bam | awk 'BEGIN{OFS="\t"}{if(NF>5) {if(and($2,1024)) {$2-=1024}} print $0}' | samtools view -Sbo foo.unmarked.bam -

This relies on and(), which is a gawk extension. I think Macs ship a different awk rather than gawk by default, so this doesn't work there.
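
If your awk has no and(), the same bit test can be done with plain arithmetic, which should work in any POSIX awk. A minimal sketch, not tested on real data: the function name unmark_duplicates and the sample read are made up for illustration, and header lines are detected by their leading "@" rather than by field count.

```shell
# Clear the duplicate bit (0x400 = 1024) from SAM records read on stdin.
# Header lines (starting with "@") are passed through unchanged.
unmark_duplicates() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^@/ { print; next }
         {
             # int($2 / 1024) % 2 is 1 exactly when bit 0x400 is set
             if (int($2 / 1024) % 2 == 1) $2 -= 1024
             print
         }'
}

# Hand-made SAM record with flag 1107 (= 1024 + 83) to show the effect
printf 'r1\t1107\tchr1\t100\t60\t4M\t=\t50\t-54\tACGT\tIIII\n' | unmark_duplicates
```

With a real file it would slot into the same pipeline as above: samtools view -h foo.bam | unmark_duplicates | samtools view -b -o foo.unmarked.bam -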


If you're on Mac and not using homebrew, you're missing out on a bunch of cool stuff.


Thanks! Worked perfectly ;)


Worked for me but the next stage of my pipeline (realignment using GATK) did not like the resulting BAM files.

##### ERROR MESSAGE: SAM/BAM file /home/cci/sau103/datahome/sals/scratch/Run/bowtie2/100028_S233_L001.asd.bam is malformed: Invalid file pointer: 4570