Question: Normalizing ChIP-seq and ATAC-seq data?
1
gravatar for star
23 months ago by
star270
Netherlands
star270 wrote:

I have some Single end and Paired end ChIP-seq and ATAC-seq data. I have done aligning using Bowtie2 and ignored duplicated reads using Picard, then performed peak calling on Bam file using MACS2 (I just need peaks and I will not continue with Differential analysis).

I would like to normalize data based on sequence depth and I found I can do it using bamCoverage based on bam file and using cqn packages based on number of reads for each peaks. I prefer to do on bam file using bamCoverage but its out put is bigwig/bedgraph that is not suitable for peak calling by MACS2.

I would like to know is there any solution for that?

ADD COMMENTlink modified 23 months ago • written 23 months ago by star270
0
gravatar for ATpoint
23 months ago by
ATpoint44k
ATpoint44k wrote:

You can simply subsample the BAM files to match the one with the fewest reads determined by samtools flagstat

Once you have that you can use this code, e.g. to subsample all files to 10mio reads:

#!/bin/bash
## Subsample BAM to a given number of reads:

function SubSample {
## code inspired by: http://crazyhottommy.blogspot.com/2016/05/downsampling-for-bam-files-to-certain.html
FACTOR=$(samtools idxstats $1 | cut -f3 | awk -v COUNT=$2 'BEGIN {total=0} {total += $1} END {print COUNT/total}')

if [[ $FACTOR > 1 ]]
  then 
  echo '[ERROR]: Requested number of reads exceeds total read count in' $1 '-- exiting' && exit 1
fi

sambamba view -s $FACTOR -t 2 -f bam -l 5 $1

}

export -f SubSample

ls *.bam | parallel "SubSample {} 10000000 > {.}_subsampled.bam"
ADD COMMENTlink modified 23 months ago • written 23 months ago by ATpoint44k

Thank for your reply.

For ChIp-seq that we use Control for peak calling using MACS2, I think the MACS2 does normalizing, in that case is it necessary to do normalizing on bam file before peak calling?

But for ATAC seq, can we do normalizing after peak calling?e.g. using RPM?

ADD REPLYlink written 23 months ago by star270

I think the MACS2 does normalizing

True

But for ATAC seq, can we do normalizing after peak calling?e.g. using RPM?

To do what, displaying the data on a browser? You said you are only interested in peak presence. Please explain.

ADD REPLYlink written 23 months ago by ATpoint44k

Thanks, yes I got different ATAC seq data and ChIP seq data and re-analyzing them and I want to see just their peaks in browser but if I want to compare them together.Can I normalize ATAC seq peak?

ADD REPLYlink written 23 months ago by star270

Compare them by eye? I do not understand.

ADD REPLYlink written 23 months ago by ATpoint44k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1929 users visited in the last hour
_