Question: How to specify one or more input files in Picard MarkDuplicates? How to specify paired-end reads?
3.4 years ago, ashwinireddy.challa wrote:

Hello All:

I am trying to remove duplicates using Picard. I have two separate files, one from control and one from treatment, and I would like to remove duplicates from both using a single command line. Can anyone suggest how to specify multiple input files in Picard MarkDuplicates, and how to specify paired-end reads? I am really confused.

Thanks in advance.

3.4 years ago, Kevin Blighe (Republic of Ireland) wrote:

Picard MarkDuplicates only identifies and marks optical/PCR duplicates from an alignment file in SAM/BAM format. It does not accept 'raw' unaligned reads.

To identify and remove duplicates in an aligned BAM file using Picard and SAMtools, use:

samtools sort MySample_Aligned.bam -o MySample_Aligned_Sorted.bam

java -jar MarkDuplicates.jar INPUT=MySample_Aligned_Sorted.bam OUTPUT=MySample_Aligned_Sorted_PCRDupes.bam ASSUME_SORTED=true METRICS_FILE=MySample_Aligned_Sorted_PCRDupes.txt

samtools index MySample_Aligned_Sorted_PCRDupes.bam

samtools view -b -F 0x400 MySample_Aligned_Sorted_PCRDupes.bam > MySample_Aligned_Sorted_PCRDupesRemoved.bam

samtools index MySample_Aligned_Sorted_PCRDupesRemoved.bam
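
As an alternative to filtering on flag 0x400 afterwards, MarkDuplicates has a REMOVE_DUPLICATES=true option that leaves duplicates out of the output file directly. The following is a minimal sketch only: it assumes a newer Picard release that ships as a single picard.jar (the PICARD_JAR path is a placeholder) and a BAM that is already coordinate-sorted. The function name markdup_remove and the DRY_RUN switch are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: PICARD_JAR is a placeholder path; adjust to your installation.
PICARD_JAR="${PICARD_JAR:-picard.jar}"

# Mark duplicates and leave them out of the output in one Picard call.
# Expects a coordinate-sorted BAM as its only argument.
# Set DRY_RUN=1 to print the command instead of running it.
markdup_remove() {
  local sorted_bam="$1"
  local prefix="${sorted_bam%.bam}"
  local cmd=(java -jar "$PICARD_JAR" MarkDuplicates
    INPUT="$sorted_bam"
    OUTPUT="${prefix}_PCRDupesRemoved.bam"
    METRICS_FILE="${prefix}_PCRDupes.txt"
    REMOVE_DUPLICATES=true)
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}
```

Calling markdup_remove MySample_Aligned_Sorted.bam would then write MySample_Aligned_Sorted_PCRDupesRemoved.bam plus the metrics file. The output still needs indexing with samtools index.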

For removing these from FASTQ files, see here: Removing PCR duplicates from .fastq without .bam alignment


Sorry, maybe I was not clear in my question. I have two SAM/BAM files, one from control and one from treatment. I want to mark and remove duplicates from these SAM/BAM files using Picard, so I would like to know how to specify multiple single-end SAM/BAM files, and also paired-end files, on one command line.

Reply written 3.4 years ago by ashwinireddy.challa

You can only remove duplicates from one sample at a time. If you want to do it on several files, set up a loop in Bash (or whichever shell you are familiar with).
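
Such a loop could look roughly like the sketch below, which repeats the sort, mark, index, filter and index steps from the answer above for each BAM file. The assumptions: a newer Picard release that ships as a single picard.jar (PICARD_JAR is a placeholder path), samtools on the PATH, and hypothetical file names Control.bam and Treatment.bam. The function names and the DRY_RUN switch are made up for illustration. As far as I know, MarkDuplicates reads the pairing information from the SAM flags, so single-end and paired-end BAMs are passed the same way. INPUT can be given more than once, but everything then goes into a single OUTPUT, which would mix control and treatment.

```shell
#!/usr/bin/env bash
# Sketch only: PICARD_JAR is a placeholder path; adjust to your installation.
PICARD_JAR="${PICARD_JAR:-picard.jar}"

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Sort, mark duplicates, index, filter out flagged duplicates, index again.
dedup_one() {
  local bam="$1"
  local prefix="${bam%.bam}"
  run samtools sort "$bam" -o "${prefix}_Sorted.bam" &&
  run java -jar "$PICARD_JAR" MarkDuplicates \
    INPUT="${prefix}_Sorted.bam" \
    OUTPUT="${prefix}_Sorted_PCRDupes.bam" \
    ASSUME_SORTED=true \
    METRICS_FILE="${prefix}_Sorted_PCRDupes.txt" &&
  run samtools index "${prefix}_Sorted_PCRDupes.bam" &&
  run samtools view -b -F 0x400 \
    -o "${prefix}_Sorted_PCRDupesRemoved.bam" \
    "${prefix}_Sorted_PCRDupes.bam" &&
  run samtools index "${prefix}_Sorted_PCRDupesRemoved.bam"
}

# Process each BAM given on the command line, one sample at a time.
dedup_all() {
  local bam
  for bam in "$@"; do
    dedup_one "$bam" || return 1
  done
}
```

For example, dedup_all Control.bam Treatment.bam processes both samples in turn and keeps their outputs separate. Running it first with DRY_RUN=1 prints the commands without executing them.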

Reply written 3.4 years ago by Kevin Blighe

Also take a look here, where the same question was asked. The solution is effectively what I mentioned in my previous reply: http://seqanswers.com/forums/showthread.php?t=66969

Trust this helps.

Reply written 3.4 years ago by Kevin Blighe

Thank you so much Kevin!!! I will look into these links.

Reply written 3.4 years ago by ashwinireddy.challa

Great - no problem! :)

Reply written 3.4 years ago by Kevin Blighe