Picard Base Distribution by Cycle and adapter contamination
13 months ago
ElCascador ▴ 20

Hi,

I am having trouble interpreting the CollectMultipleMetrics.base_distribution_by_cycle plot from Picard for ATAC-seq data.

In my example, there are odd patterns at what looks like the beginning of each paired-end read. Is this a direct reflection of the FastQC sequence content across all bases? I am worried about adapter contamination, but the Picard plot is the same with or without adapter removal.

The Picard plot: Picard

The FastQC plot: fastqc

Can I just take this post about the FastQC metric to heart and call it a day?

atac-seq picard • 360 views

If these libraries were made with Nextera (transposon-based) kits, they show a pattern similar to the random-primed ones in the blog post you linked. You can move forward with the rest of the analysis.
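For readers unsure what these per-cycle plots measure, here is a minimal sketch of the underlying calculation: the fraction of each base at each read position. The function name and the toy reads are made up for illustration; this is not how Picard or FastQC implement it, only the same idea in a few lines.

```python
from collections import Counter


def base_fractions_by_cycle(reads):
    """Return one dict per read position (cycle) with the fraction of A/C/G/T/N."""
    if not reads:
        return []
    n_cycles = max(len(r) for r in reads)
    counts = [Counter() for _ in range(n_cycles)]
    # Tally the base seen at each position across all reads.
    for read in reads:
        for i, base in enumerate(read.upper()):
            counts[i][base] += 1
    fractions = []
    for c in counts:
        total = sum(c.values())
        fractions.append({b: c[b] / total for b in "ACGTN"})
    return fractions


# Toy reads, invented for this example (not real ATAC-seq data).
toy_reads = ["GTCAG", "GACTA", "GTCCA", "CTGAT"]
for cycle, frac in enumerate(base_fractions_by_cycle(toy_reads), start=1):
    print(cycle, frac)
```

A skew confined to the first few cycles of each read, as in the toy data where most reads start with G, is the kind of pattern that insertion-site bias produces, whereas adapter read-through would be expected toward the 3' end of reads.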

13 months ago
h.mon 32k

As genomax pointed out, you can most likely move forward with your analysis without further concern. However, if you really want to check, you can run bbmap.sh (from the BBTools/BBMap suite) with mhist=mhist.txt. The mhist.txt file will contain a histogram of matches and mismatches by read position, relative to the reference genome. From the bbmap.sh help:

 mhist=<file>        Histogram of match, sub, del, and ins rates by read location.
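
Once mhist.txt exists, a small script can flag read positions whose substitution rate stands out from the rest. The sketch below assumes the file is tab-separated, has a header line starting with `#`, and holds the read position in the first column followed by numeric rate columns. The column names `BaseNum`, `Match1` and `Sub1`, the 2x-median threshold, and the toy values are assumptions for illustration, so check them against your own file.

```python
import io
import statistics


def read_mhist(handle):
    """Parse a tab-separated histogram with a '#'-prefixed header line.

    Assumed layout: first column is the read position, the rest are rates.
    If several '#' lines are present, the last one is used as the header.
    """
    header = None
    rows = []
    for raw in handle:
        line = raw.rstrip("\r\n")
        if not line:
            continue
        if line.startswith("#"):
            header = line.lstrip("#").split("\t")
            continue
        if header is None:
            raise ValueError("no '#' header line found before data")
        fields = line.split("\t")
        row = {header[0]: int(fields[0])}
        for name, value in zip(header[1:], fields[1:]):
            row[name] = float(value)
        rows.append(row)
    return header, rows


def flag_positions(header, rows, column, factor=2.0):
    """Return positions where `column` exceeds `factor` times its median."""
    if not rows:
        return []
    median = statistics.median(r[column] for r in rows)
    return [r[header[0]] for r in rows if r[column] > factor * median]


# Toy content with assumed column names; replace with open("mhist.txt").
toy = io.StringIO(
    "#BaseNum\tMatch1\tSub1\n"
    "1\t0.90\t0.10\n"
    "2\t0.98\t0.02\n"
    "3\t0.98\t0.02\n"
    "4\t0.97\t0.03\n"
)
header, rows = read_mhist(toy)
print(flag_positions(header, rows, "Sub1"))
```

If mismatches rise sharply toward the 3' end of the reads, leftover adapter sequence is a plausible cause; a flat profile suggests the composition bias at the read starts is not an alignment problem.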