Question

Mark Duplicates

3.3 years ago

priya.bmg ▴ 60

Hello

I have started learning the steps in an NGS pipeline.

First, using Bowtie 2, I aligned the FASTQ files to the reference and saved the output as a SAM file.

bowtie2 -x grch38_1kgmaj  -U 24_1.fastq,24_2.fastq -S eg1.sam

Then, using samtools, I sorted the SAM file by coordinate and saved the result as a BAM file.

 samtools sort eg1.sam > my_sorted.bam

This BAM file was then indexed:

 samtools index my_sorted.bam

Now I am trying to mark duplicates in this indexed file using Picard:

 java -jar $EBROOTPICARD/picard.jar MarkDuplicates I=my_sorted.bam.bai O=marked_duplicates.bam M=marked_dup_metrics.txt

I am stuck at this step: I did not get any output. Are the above steps correct, or have I missed a step before marking duplicates?

Thanks

picard GATK NGS

score 2 · Accepted Answer · 2021-07-29

3.3 years ago

Tm ★ 1.1k

You need to pass the .bam file as input, not the BAM index (.bai). The index only sits next to the BAM file; it holds no reads itself. Your command should be:

 java -jar $EBROOTPICARD/picard.jar MarkDuplicates I=my_sorted.bam O=marked_duplicates.bam M=marked_dup_metrics.txt
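If you want to catch this kind of mix-up before launching Picard, a small guard in your script can help. This is only a minimal sketch: `is_bam_input` is a hypothetical helper name of my own, not part of Picard or samtools, and it checks nothing beyond the file extension.

```shell
#!/bin/sh
# Hypothetical helper (not part of Picard or samtools):
# succeeds only if the given file name ends in ".bam".
is_bam_input() {
    case "$1" in
        *.bam) return 0 ;;   # e.g. my_sorted.bam
        *)     return 1 ;;   # e.g. my_sorted.bam.bai, eg1.sam
    esac
}

# File name taken from the question above.
input="my_sorted.bam"

if is_bam_input "$input"; then
    echo "OK: $input looks like a BAM file and can be passed as I="
else
    echo "ERROR: $input is not a .bam file (did you pass the .bai index?)" >&2
fi
```

With `input="my_sorted.bam.bai"` the check fails and the error message is printed instead, which is the situation in the original command.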