Question: Remove duplicate readnames from a bam file
20 months ago, garnerim1988 wrote:

Hello, I am running Picard MarkDuplicates and get the error 'Value was put into PairInfoMap more than once'. I grepped out the read causing the issue because it (for some reason) has a duplicate read name. When I rerun Picard MarkDuplicates I still get the same error, this time because of a different duplicate read name.

I have analysed countless paired-end datasets and never encountered this problem before.

My question: is there a way to get a list of ALL the duplicated read names so I can filter them all out?

Bw, Ian.

written 20 months ago by garnerim1988 • 0

Hello garnerim1988,

Have you tried the suggestions in "Markduplicates: Value Was Put Into Pairinfomap More Than Once"?

Or here: https://gatkforums.broadinstitute.org/gatk/discussion/10115/picard-markdup-error-value-was-put-into-pairinfomap-more-than-once

fin swimmer

modified 20 months ago • written 20 months ago by finswimmer • 13k

Hi fin swimmer, thank you for your reply. I have read both links previously, but I will go through them more thoroughly again. I ran ATAC-seq in triplicate across a number of cell lines; some give no problems and others do, which is what is so frustrating.

written 20 months ago by garnerim1988 • 0

Filter your BAM files to contain only primary alignments. Reads reported with multiple alignments may be the source of the problem.

written 20 months ago by Istvan Albert ♦♦ 85k
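
For anyone trying the primary-alignment suggestion above, here is a minimal sketch of what that filter does, written against plain SAM text (for example the output of `samtools view -h`). It assumes the standard SAM flag bits 0x100 (secondary) and 0x800 (supplementary); the function name `primary_only` and the file names in the usage note are only illustrative.

```python
SECONDARY = 0x100      # SAM flag bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment


def primary_only(sam_lines):
    """Yield header lines and primary alignment records, dropping the rest."""
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        # The FLAG is the second tab-separated column of a SAM record.
        flag = int(line.split("\t", 2)[1])
        if not flag & (SECONDARY | SUPPLEMENTARY):
            yield line


# To use it as a filter in a pipe, one could add:
#   import sys
#   sys.stdout.writelines(primary_only(sys.stdin))
```

Saved as a script (say `primary_only.py`, a made-up name) with the two commented lines enabled, it could be run as `samtools view -h in.bam | python primary_only.py | samtools view -b -o primary.bam -`. In practice `samtools view -b -F 0x900 in.bam` should give the same result without any Python.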

I have done this and still get the same error.

written 20 months ago by garnerim1988 • 0
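
Coming back to the original question of listing every duplicated read name in one pass instead of one per failed run: a rough sketch is below. It reads SAM text and reports each read name whose primary record appears more than once for the same mate (first or second in pair). That this is exactly the condition that triggers the PairInfoMap error is an assumption here, and the function name `duplicated_read_names` is made up for illustration.

```python
from collections import Counter

SECONDARY = 0x100       # SAM flag bit: secondary alignment
SUPPLEMENTARY = 0x800   # SAM flag bit: supplementary alignment
FIRST_IN_PAIR = 0x40    # SAM flag bit: first read of the pair
SECOND_IN_PAIR = 0x80   # SAM flag bit: second read of the pair


def duplicated_read_names(sam_lines):
    """Return sorted read names seen more than once for the same mate."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        name, flag_text = line.split("\t", 2)[:2]
        flag = int(flag_text)
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # only primary records are counted
        mate = flag & (FIRST_IN_PAIR | SECOND_IN_PAIR)
        counts[(name, mate)] += 1
    # A name is reported if either of its mates occurs more than once.
    return sorted({name for (name, _), n in counts.items() if n > 1})


# To print the list from a pipe, one could add:
#   import sys
#   print("\n".join(duplicated_read_names(sys.stdin)))
```

Fed with `samtools view in.bam`, this would produce one read name per line, which could then be saved to a file and passed to a tool that excludes reads by name (Picard's FilterSamReads with `FILTER=excludeReadList` should do this). Note that the counter keeps every read name in memory, so for very large BAM files a sort-based approach may be more practical.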