Question: Mitochondrial contamination over 90% in published atac-seq dataset.
0
gravatar for daniele.avancini
10 days ago by
daniele.avancini0 wrote:

Hi all,

I know this topic has been touched in the past but I couldn't find an answer to my question. Background: in my lab, we are thinking to move from chip-seq of histone marks to atac-seq. We have data on CD4+ T cells and we would like to compare what we found with an already published dataset in the same cell type (Buenrostro et al. 2013: GEO: GSE47753). In addition, I would use this as a test to see if we have the capabilities of doing an atac-seq analysis.

So I downloaded the 6 sra files regarding the CD4+ T cells, produced the fastq files with relative quality control and aligned to hg19 using bowtie2 ( -X2000 ). I know in the paper they use bowtie instead of bowtie2 but I didn't think that could make a big difference at this step. Looking at the SAM fils after the alignment, I realized that more than 90% of the reads are mapped to mitochondrial DNA leaving me with 50 to 200 thousand reads to do peak calling. I tried to align using bwa mem but no much difference was found. In the paper, they declare to have an average around 46% of reads mapping to mtDNA.

So the questions: what am I doing wrong? Did anyone else find the same thing? Am I misreading the methods of the paper?

Thanks in advance and I'm sorry for such a beginner question.

Daniele.

ADD COMMENTlink written 10 days ago by daniele.avancini0

Not trying to defend the published work but if you were trying to reproduce published analysis then first thing you should do is stick with programs/parameters used in that study. Since you have a major variable (bowtie2 can do gapped alignments as opposed to bowtie v.1) it would be improper to question published results at this time.

ADD REPLYlink modified 10 days ago • written 10 days ago by genomax39k

Questioning published results wasn't the purpose of my query at all, in fact, I'm pretty sure that the problem is at my end. I used two different aligners obtaining similar results. But you pointed out something solid and I'll use bowtie v.1 to perform the alignments. Thank for your reply.

ADD REPLYlink written 10 days ago by daniele.avancini0

Apologies for mistaking your intent. It is not the first time one finds that the published results may not be reproducible or have a significant error in them. If your result is correct then having 90% reads mapping to MT would indeed be worrisome. Since you want to compare your results to theirs you may have to redo the analysis of your data with bowtie v.1, if that looks more like what is published.

ADD REPLYlink modified 10 days ago • written 10 days ago by genomax39k

Thank again, I will post the results of the new analysis.

ADD REPLYlink written 10 days ago by daniele.avancini0

The new analysis using bowtie instead of bowtie2 gave a bit better results with an average MT mapping around 80%. Still high, but is the best I could get and I don't really know where I find this discrepancy in the data. I will call the peaks anyway and see what happens. Do you have any further suggestion? Thanks for helping!

ADD REPLYlink written 9 days ago by daniele.avancini0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 966 users visited in the last hour