Question: Normalizing ChIP-seq and ATAC-seq data?
8 days ago by
star110
Netherlands
star110 wrote:

I have some single-end and paired-end ChIP-seq and ATAC-seq data. I aligned the reads with Bowtie2, excluded duplicate reads using Picard, and then called peaks on the BAM files with MACS2 (I only need the peaks and will not continue with differential analysis).

I would like to normalize the data by sequencing depth. I found that I can do this either with bamCoverage, working on the BAM file, or with the cqn package, working on the number of reads in each peak. I would prefer to work on the BAM file with bamCoverage, but its output is bigWig/bedGraph, which is not suitable as input for peak calling with MACS2.

Is there a solution for this?

8 days ago by
ATpoint13k
Germany
ATpoint13k wrote:

You can simply subsample the BAM files to match the one with the fewest reads, as determined by samtools flagstat.

Once you have that number, you can use this code, e.g. to subsample all files to 10 million reads:
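As a rough sketch of the "fewest reads" step: the helpers below pull the mapped-read count out of `samtools flagstat` text and pick the smallest count from a list. The function names `mapped_from_flagstat` and `smallest_count` are made up for illustration, and the parsing assumes the usual flagstat line of the form `1900 + 0 mapped (95.00% : N/A)`.

```shell
#!/bin/bash
# Hypothetical helper: read `samtools flagstat` output on stdin and print
# the QC-passed mapped-read count. The fourth field is "mapped" only on the
# overall mapped line (the "primary mapped" line has "primary" there).
mapped_from_flagstat() {
  awk '$4 == "mapped" {print $1; exit}'
}

# Hypothetical helper: read "name<TAB>count" lines on stdin and print the
# smallest count, i.e. the depth every other file would be subsampled to.
smallest_count() {
  sort -t$'\t' -k2,2n | head -n 1 | cut -f2
}

# Possible usage (assumes samtools is on PATH and BAM files are in the
# current directory):
# for f in *.bam; do
#   printf '%s\t%s\n' "$f" "$(samtools flagstat "$f" | mapped_from_flagstat)"
# done | smallest_count
```

The printed number can then be used as the target read count for the subsampling step.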

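#!/bin/bash
## Subsample BAM to a given number of reads (needs indexed BAM files, samtools, sambamba and GNU parallel):

function SubSample {
## code inspired by: http://crazyhottommy.blogspot.com/2016/05/downsampling-for-bam-files-to-certain.html
## $1 = BAM file, $2 = number of reads to keep
## Column 3 of idxstats is the mapped read count per reference sequence
FACTOR=$(samtools idxstats "$1" | cut -f3 | awk -v COUNT="$2" 'BEGIN {total=0} {total += $1} END {print COUNT/total}')

## [[ $FACTOR > 1 ]] would compare strings, so do the numeric comparison in awk
if awk -v f="$FACTOR" 'BEGIN {exit !(f + 0 > 1)}'
  then
  ## Report on stderr, because stdout is redirected into the output BAM
  echo '[ERROR]: Requested number of reads exceeds total read count in' "$1" '-- exiting' >&2 && exit 1
fi

## Keep a random fraction of the reads and write BAM to stdout
sambamba view -s "$FACTOR" -t 2 -f bam -l 5 "$1"

}

export -f SubSample

ls *.bam | parallel "SubSample {} 10000000 > {.}_subsampled.bam"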
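
To make the FACTOR arithmetic concrete without touching a real BAM file, here is a small sketch that runs the same calculation on idxstats-style text. The function name `subsample_fraction` and the read counts in the usage note are invented for illustration; the only assumption about the input is the standard four-column idxstats layout (reference name, length, mapped reads, unmapped reads).

```shell
#!/bin/bash
# Hypothetical helper: read `samtools idxstats` output on stdin and print
# the fraction of reads to keep so that about $1 mapped reads remain.
subsample_fraction() {
  awk -v target="$1" -F'\t' '
    { total += $3 }                      # column 3 = mapped reads
    END {
      if (total == 0) { print "no mapped reads found" > "/dev/stderr"; exit 1 }
      frac = target / total
      if (frac > 1)   { print "target exceeds total read count" > "/dev/stderr"; exit 1 }
      print frac
    }'
}
```

With made-up counts of 6,000,000 and 4,000,000 mapped reads on two chromosomes and a target of 2,500,000, the function prints 0.25, which is the value that would be passed to `sambamba view -s`.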

Thanks for your reply.

For ChIP-seq, where we use a control for peak calling with MACS2, I think MACS2 does the normalization itself. In that case, is it necessary to normalize the BAM files before peak calling?

But for ATAC-seq, can we normalize after peak calling, e.g. using RPM?

written 8 days ago by star110
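
For reference, RPM (reads per million) is just the read count in a region multiplied by 1,000,000 and divided by the total number of reads in the library. A minimal sketch of that arithmetic, with hypothetical function names (`rpm`, `rpm_table`) and made-up counts:

```shell
#!/bin/bash
# rpm COUNT TOTAL: print COUNT scaled to reads per million of TOTAL.
rpm() {
  awk -v c="$1" -v t="$2" 'BEGIN {
    if (t + 0 <= 0) exit 1               # refuse to divide by zero
    printf "%.3f\n", c * 1e6 / t
  }'
}

# rpm_table TOTAL: read "peak_id<TAB>count" lines on stdin and append an
# RPM column computed against TOTAL library reads.
rpm_table() {
  awk -v t="$1" -F'\t' -v OFS='\t' '{ printf "%s\t%s\t%.3f\n", $1, $2, $2 * 1e6 / t }'
}
```

For example, a peak with 250 reads in a library of 20,000,000 reads gives `rpm 250 20000000` = 12.500. This only rescales counts for comparison or display; it does not change which peaks are called.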

"I think the MACS2 does normalizing"

True

"But for ATAC seq, can we do normalizing after peak calling? e.g. using RPM?"

To do what? To display the data in a browser? You said you are only interested in peak presence. Please explain.

written 8 days ago by ATpoint13k

Thanks. Yes, I have several ATAC-seq and ChIP-seq datasets that I am re-analyzing. I just want to view their peaks in a browser, but I would also like to compare them with each other. Can I normalize the ATAC-seq peaks?

written 8 days ago by star110

Compare them by eye? I do not understand.

written 8 days ago by ATpoint13k