Mark Duplicates
3.3 years ago
priya.bmg ▴ 60

Hello

I have started learning the steps in an NGS pipeline.

First, using Bowtie 2, I aligned the FASTQ files to the reference genome and saved the output as a SAM file.

bowtie2 -x grch38_1kgmaj  -U 24_1.fastq,24_2.fastq -S eg1.sam 

Then, using samtools, I sorted the SAM file by coordinate and saved the result as a BAM file.

 samtools sort eg1.sam > my_sorted.bam

This BAM file was then indexed:

 samtools index my_sorted.bam

Now I am trying to mark duplicates in this indexed file using Picard:

 java -jar $EBROOTPICARD/picard.jar MarkDuplicates I=my_sorted.bam.bai O=marked_duplicates.bam M=marked_dup_metrics.txt

I am stuck at this step: I did not get any output. Are the above steps correct, or have I missed a step before marking duplicates?

Thanks

picard GATK NGS • 1.2k views
3.3 years ago
Tm ★ 1.1k

You need to pass the .bam file as input, not the BAM index (.bai), i.e. your command should be:

 java -jar $EBROOTPICARD/picard.jar MarkDuplicates I=my_sorted.bam O=marked_duplicates.bam M=marked_dup_metrics.txt
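To catch this kind of mix-up before Picard starts, one option is a small shell wrapper that checks the input path first. This is only a sketch: `is_alignment_file` and `mark_duplicates` are hypothetical helper names, not part of Picard or samtools, and the check looks at the file extension only, assuming the input is named `*.bam` or `*.sam`.

```shell
# Sketch only: is_alignment_file and mark_duplicates are hypothetical helpers.

# Succeeds only for paths ending in .bam or .sam, so an index file such as
# my_sorted.bam.bai is rejected. This inspects the name, not the contents.
is_alignment_file() {
    case "$1" in
        *.bam|*.sam) return 0 ;;
        *)           return 1 ;;
    esac
}

# Runs Picard MarkDuplicates with the same arguments as the command above,
# but refuses to start if the input does not look like a SAM/BAM file.
mark_duplicates() {
    if ! is_alignment_file "$1"; then
        echo "error: '$1' is not a SAM/BAM file (is it an index?)" >&2
        return 1
    fi
    java -jar "$EBROOTPICARD/picard.jar" MarkDuplicates \
        I="$1" O=marked_duplicates.bam M=marked_dup_metrics.txt
}
```

With these defined, `mark_duplicates my_sorted.bam` runs the command as above, while `mark_duplicates my_sorted.bam.bai` stops with an error message instead of calling Picard.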

Thank you
