Question: Retaining duplicate mapping
15 months ago, connor.driscoll88 wrote:

I'm trying to perform a ChIP-Exo analysis with single-end Illumina reads (~50 bp). ChIP-Exo is similar in concept to ChIP-Seq, but produces more identical reads. So I want to ensure that my alignments are allowing for different reads to map to the exact same genomic positions. The closest thing I'm seeing in the bowtie2 manual seems to be focused on the same read mapping to multiple locations, not the other way around. When I look at my current bam files in IGV, I see what looks like different reads mapped to the same position, although usually at <10x coverage.

When I use the samtools flagstat command on my bam files, my output looks like this:

26780887 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
26082701 + 0 mapped (97.39%:-nan%)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (-nan%:-nan%)
0 + 0 with itself and mate mapped
0 + 0 singletons (-nan%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

There are 0 duplicates being identified, and my understanding is that duplicates in this case are "PCR or optical duplicates." Am I correct in interpreting these as different reads mapped to identical locations?

I'm ultimately trying to find a way to align reads with duplicated mappings (different reads, same genomic position). Perhaps my current alignments are already like this and I'm getting confused by some of the terminology, but I want to ensure what I'm doing is correct.

chip-exo alignment • 508 views
15 months ago, h.mon (Brazil) wrote:

samtools flagstat does not search for duplicate reads de novo; it only counts reads that have already been marked as duplicates (e.g. by Picard MarkDuplicates). So you will have to run Picard first for flagstat to report any duplicates.
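To make the distinction concrete, here is a minimal Python sketch of counting versus marking. It is a toy model, not how samtools or Picard are implemented: the read dictionaries, the `five_prime` field and both function names are made up for illustration. The only parts taken from the SAM format are the flag bits 0x400 (PCR or optical duplicate) and 0x10 (reverse strand). The toy marker keeps the first read it sees at each position, whereas real tools apply their own rule for which read to keep.

```python
DUP_FLAG = 0x400      # SAM flag bit: PCR or optical duplicate
REVERSE_FLAG = 0x10   # SAM flag bit: read maps to the reverse strand


def count_flagged_duplicates(reads):
    """Count reads whose flag already has 0x400 set (what a counter reports)."""
    return sum(1 for r in reads if r["flag"] & DUP_FLAG)


def mark_duplicates(reads):
    """Toy marker: among reads sharing chromosome, strand and 5' position,
    keep the first one seen and set 0x400 on the others."""
    seen = set()
    marked = []
    for r in reads:
        key = (r["chrom"], bool(r["flag"] & REVERSE_FLAG), r["five_prime"])
        flag = r["flag"]
        if key in seen:
            flag |= DUP_FLAG   # same position and strand as an earlier read
        else:
            seen.add(key)
        marked.append({**r, "flag": flag})
    return marked


# Hypothetical single-end reads: r1 and r2 stack on the same forward-strand
# position, r3 is on the reverse strand, r4 is somewhere else.
reads = [
    {"name": "r1", "chrom": "chr1", "five_prime": 1000, "flag": 0},
    {"name": "r2", "chrom": "chr1", "five_prime": 1000, "flag": 0},
    {"name": "r3", "chrom": "chr1", "five_prime": 1000, "flag": 16},
    {"name": "r4", "chrom": "chr1", "five_prime": 2000, "flag": 0},
]

before = count_flagged_duplicates(reads)
after = count_flagged_duplicates(mark_duplicates(reads))
print(before, after)
```

Before marking, the count is 0 even though r1 and r2 sit at the same position, which matches the "0 + 0 duplicates" line in the flagstat output above. All four reads are still present after marking; the duplicate is only flagged, not removed.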

[...] are "PCR or optical duplicates." Am I correct in interpreting these as different reads mapped to identical locations?

Yes, you are.
