mitochondrial blacklists ATAC-Seq
8.2 years ago
simonjean434 ▴ 70

Hello all,

Buenrostro et al. (2015) mentioned a custom blacklist of mitochondrial homologs for hg19 and mm10. Does anyone know if this list is publicly available somewhere?

Single-cell chromatin accessibility reveals principles of regulatory variation

Buenrostro JD, Wu B, Litzenburger UM, Ruff D, Gonzales ML, Snyder MP, Chang HY, Greenleaf WJ.

Nature. 2015 Jul 23;523(7561):486-90. doi: 10.1038/nature14590. Epub 2015 Jun 17.

Thanks

S

ATAC-SEQ • 6.9k views
8.2 years ago
James Ashmore ★ 3.4k

I don't know whether it is available; you could try contacting the authors or asking on their ATAC-seq forum. You could also create your own mitochondrial blacklist, which is what I did when analysing my ATAC-seq data: use wgsim to simulate reads from the mitochondrial chromosome, align them to your reference genome, call peaks using MACS, and use the called peak regions as your blacklist. For consistency, I generated reads of the same length as the data I was analysing.
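The steps above could be sketched roughly as follows. This is not the exact procedure used for the original analysis: the function name, file names, read count, error-free simulation settings, the `chrM` contig name, and the human genome size preset (`-g hs`) are all assumptions made for illustration, and the sketch expects wgsim, samtools, bowtie2 and macs2 to be on the PATH.

```shell
# Sketch only: build a mitochondrial-homology blacklist from simulated reads.
# Every file name and parameter value below is a placeholder, not a
# recommendation from the thread.
make_mito_blacklist() {
    genome_fa=$1   # reference FASTA (indexable with samtools faidx)
    bt2_index=$2   # bowtie2 index prefix built from the same reference
    read_len=$3    # match the read length of the real ATAC-seq data
    out_bed=$4     # blacklist BED file to write

    # 1. Extract the mitochondrial sequence (assumed to be named chrM).
    samtools faidx "$genome_fa" chrM > chrM.fa || return 1

    # 2. Simulate paired-end reads from chrM only, with sequencing errors,
    #    mutations and indels switched off (an assumption of this sketch).
    wgsim -e 0 -r 0 -R 0 -X 0 -N 1000000 -1 "$read_len" -2 "$read_len" \
        chrM.fa sim_1.fq sim_2.fq > /dev/null || return 1

    # 3. Align the simulated reads to the full reference genome.
    bowtie2 -x "$bt2_index" -1 sim_1.fq -2 sim_2.fq |
        samtools sort -o sim.bam - || return 1

    # 4. Call peaks on the simulated alignments.
    macs2 callpeak -t sim.bam -f BAMPE -g hs -n sim --keep-dup all || return 1

    # 5. Keep only nuclear peaks as the blacklist; chrM itself is
    #    handled separately.
    awk -F '\t' -v OFS='\t' '$1 != "chrM" { print $1, $2, $3 }' \
        sim_peaks.narrowPeak > "$out_bed"
}
```

A call might look like `make_mito_blacklist hg19.fa hg19_index 50 mito_blacklist.bed`, where all four arguments are hypothetical.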

7.1 years ago
BioinfGuru ★ 1.7k

You've confused 2 separate things that must be removed: (edit: apologies...you didn't)

  1. Mitochondrial reads
  2. blacklisted regions

To remove the mitochondrial reads, just index your BAM files and then remove chrM from each of them.

$ for i in *.bam; do non_chrM_list=$(samtools view -H "$i" | grep '^@SQ' | cut -f2 | sed 's/^SN://' | grep -v '^chrM$'); samtools view -b -o [outfilename.bam] "$i" $non_chrM_list; done

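To check that all chrM reads are removed (a BAM file is compressed, so decode it with samtools view rather than grepping it directly; the count printed should be 0):

samtools view [outfilename.bam] | cut -f3 | grep -c chrM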
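The header-parsing step of the loop above can be tried without samtools on a small made-up header. The sketch below is an awk variant of the same idea, not the pipeline from the answer: `non_chrM_contigs` and `toy_header` are hypothetical names, and the header content is invented purely for illustration. It only looks at `@SQ` lines, so a `@PG` line that happens to contain "chr" is ignored.

```shell
# Print every reference sequence name found in a SAM header on stdin,
# except the mitochondrial contig (assumed here to be named "chrM").
non_chrM_contigs() {
    awk -F '\t' '
        $1 == "@SQ" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SN:/) {
                    name = substr($i, 4)      # strip the "SN:" prefix
                    if (name != "chrM") print name
                }
        }'
}

# Toy header: two nuclear contigs, chrM, and a @PG line whose command
# line contains "chr" but is not a sequence record.
toy_header() {
    printf '@HD\tVN:1.6\tSO:coordinate\n'
    printf '@SQ\tSN:chr1\tLN:1000\n'
    printf '@SQ\tSN:chr2\tLN:800\n'
    printf '@SQ\tSN:chrM\tLN:100\n'
    printf '@PG\tID:bowtie2\tCL:bowtie2 -x chr_index\n'
}

toy_header | non_chrM_contigs
```

On the toy header this prints chr1 and chr2. With a real file, the input would come from `samtools view -H` instead of `toy_header`.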

For blacklist regions, see here and here:

mm10    http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/mm10-mouse/

Here is a post on removing mito contaminants. Here is another.


I don't think the user has confused anything. The authors of ATAC-seq created a mitochondrial blacklist that marks high-signal regions on the nuclear genome caused by read sequence homology with the mitochondrial genome. The authors have since uploaded the blacklist files, so you can get them directly instead of generating your own list.


wow.

I'm so glad I posted that answer. I'm literally about to start removing the blacklist... and I didn't know about that link. Thank you James.

Is there a link that doesn't go through Google forums? I need permission for access.


These homologous regions are called NUMTs (nuclear mitochondrial DNA segments).


So Jeremy/James ... Are NUMTs recognised by bowtie2 and annotated with chrM?

The code above removes from a BAM file only those reads that the aligner (bowtie2 in my case) assigns to 'chrM'.

Can I trust that all NUMTs are gone?


Reads from real NUMTs, assuming the transposase binds to them, would map to the NUMT regions you list above. I don't know whether they would also align to chrM. Better to blacklist both chrM and the regions.
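As a rough illustration of what blacklisting both means at the interval level, here is a small awk sketch that drops every record on chrM and every record overlapping a blacklist interval. It is a toy stand-in, not a replacement for bedtools: `drop_blacklisted` is a hypothetical helper, the coordinates are invented, and the linear scan over the blacklist would be slow on real data.

```shell
# Usage: drop_blacklisted BLACKLIST.bed < in.bed > out.bed
# Removes (a) every record on chrM and (b) every record that overlaps a
# blacklist interval. BED intervals are half-open: [start, end).
drop_blacklisted() {
    awk -F '\t' -v OFS='\t' '
        FILENAME == ARGV[1] {             # first input: the blacklist
            n++
            bchr[n] = $1; bstart[n] = $2 + 0; bend[n] = $3 + 0
            next
        }
        $1 == "chrM" { next }             # rule (a)
        {
            for (k = 1; k <= n; k++)
                if ($1 == bchr[k] && $2 + 0 < bend[k] && bstart[k] < $3 + 0)
                    next                  # rule (b): overlap found
            print
        }' "$1" -
}

# Toy data: one blacklist interval and four candidate regions.
bl=$(mktemp)
printf 'chr1\t100\t200\n' > "$bl"
printf 'chr1\t50\t100\nchr1\t150\t250\nchrM\t0\t50\nchr2\t100\t200\n' |
    drop_blacklisted "$bl"
rm -f "$bl"
```

On the toy data only the chr1 50-100 and chr2 100-200 records survive: the first merely touches the blacklist interval without overlapping it, and the second is on a different chromosome.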


Thank you Jeremy.

  1. ATAC_seq author mitochondrial blacklist

  2. ENCODE signal artifact blacklist

Using the following command I have now removed the contents of both blacklists from my bed files (of course I had already removed chrM using the command I originally posted above):

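for i in *.bedfile; do bedtools intersect -v -a "$i" -b [PATH]/mitochondrial.blacklist.bed [PATH]/signal.artifact.blacklist.bed > "$i.bed"; done

Be careful that the output names do not match the input pattern (for example, if all your files end in .bed). The shell expands the pattern once before the loop starts, so it will not run forever, but re-running the command would then also process its own output files.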
