Question: ATAC-seq: peak calling, reproducibility and differential analysis without replicates
anais1396 (Brussels) wrote 21 months ago:

Hello Biostars community!

I would like to know whether anyone has experience with normalization and differential expression on ATAC-seq data when the samples have no replicates.

I ran MACS2 for peak calling and used IDR (irreproducible discovery rate) and intersectBed on my first set of ATAC-seq data (which contains replicates) to find the peaks shared between my samples. I also have a second data set without replicates, and I would like to run the same analysis on it.

Is there a way to do the same thing on my second set, where I have no replicates? Which tools should I use for that?

Then, I would like to run differential expression on my two data sets. Is the strategy the same in both cases (with and without replicates)? What strategy do you use for differential expression in ATAC-seq data?

Thank you in advance!

Anaïs

i.sudbery (Sheffield, UK) wrote 21 months ago:

Depends what you mean by no replicates. If you mean you have, for example, a bunch of different patients, some with disease and some without, but you only have one sample from each, then you do have replicates - each patient counts as a replicate.

If you mean you have a cell line and you have treated and untreated cells and only one sample from each, then you do not have replicates.

With no replicates, the only IDR-based analysis you can do is to check for internal consistency by generating pseudo-replicates and testing them for self-consistency.
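
A minimal sketch of the pseudo-replicate step, assuming the reads of the single sample are available as a list of names. The helper `split_pseudoreplicates` and the toy read names are hypothetical and not part of any tool. Each half would then go through the same peak caller, and IDR would be run between the two resulting peak sets.

```python
import random


def split_pseudoreplicates(reads, seed=0):
    """Shuffle reads and split them into two pseudo-replicates of near-equal size."""
    shuffled = list(reads)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Toy input: read names standing in for alignment records (hypothetical data).
reads = [f"read_{i}" for i in range(10)]
pr1, pr2 = split_pseudoreplicates(reads, seed=42)
print(len(pr1), len(pr2))  # 5 5
```

For paired-end data the split would have to be done per fragment, so that both mates of a pair end up in the same pseudo-replicate.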

Without replicates you basically can't do differential analysis of your ATAC-seq data. The best you can do is compare peak sets in both directions. First, take the peaks from your unreplicated condition and ask which of them are absent from the peak sets of every replicate in the replicated condition. Second, ask which peaks that are reproducible in the replicated condition are absent from the unreplicated condition.
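
The two-way comparison above can be sketched roughly as follows. This is an illustration under assumptions: peaks are `(chrom, start, end)` tuples with half-open coordinates, "present" means overlapping by at least one base, and "reproducible" is approximated as overlapping a peak in every replicate rather than passing IDR. All function names and coordinates are hypothetical.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def absent_from_all(query_peaks, reference_sets):
    """Peaks in query_peaks that overlap no peak in any of the reference sets."""
    return [
        p for p in query_peaks
        if not any(overlaps(p, r) for ref in reference_sets for r in ref)
    ]


def present_in_all(peak_sets):
    """Peaks from the first set that overlap a peak in every other set.

    Assumption: a simple stand-in for an IDR-filtered reproducible peak list.
    """
    first, rest = peak_sets[0], peak_sets[1:]
    return [
        p for p in first
        if all(any(overlaps(p, r) for r in ref) for ref in rest)
    ]


# Toy peak sets (hypothetical coordinates).
rep1 = [("chr1", 100, 200), ("chr1", 500, 600)]
rep2 = [("chr1", 150, 250), ("chr1", 520, 580), ("chr2", 10, 50)]
single = [("chr1", 120, 180), ("chr1", 900, 1000)]

# Peaks seen only in the unreplicated condition.
only_in_single = absent_from_all(single, [rep1, rep2])
# Reproducible peaks of the replicated condition that the unreplicated one lacks.
lost_in_single = absent_from_all(present_in_all([rep1, rep2]), [single])

print(only_in_single)  # [('chr1', 900, 1000)]
print(lost_in_single)  # [('chr1', 500, 600)]
```

The nested loops are quadratic, which is fine for a toy example; on real peak files an interval tool such as the intersectBed already mentioned in the question is the more practical choice.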


anais1396 replied 21 months ago:

I mean I have 2 groups of patients, some with disease and some healthy, but in each group I have just one sample from each patient. For instance, in group 1 (with disease) I have 6 samples, each from a different patient, and in group 2 (healthy) I have 8 samples, each from a different person.

So if I follow what you said, I can consider that I have 6 replicates in group 1 and 8 replicates in group 2?

i.sudbery replied 21 months ago:

Yes, that is what I am suggesting.

Devon Ryan (Freiburg, Germany) wrote 21 months ago:

For the unreplicated experiment you just won't be using IDR.

I assume by "differential expression" you mean integrating your ATAC-seq data with RNA-seq data. At the end of the day that's the same regardless of whether the ATAC-seq has replicates or not. You're generally looking for DE genes downstream of or overlapping ATAC-seq peaks. You could do differential peak calling between the conditions and look at DE genes in the resulting peaks; that'd be the simplest route.


i.sudbery replied 21 months ago:

I think the OP was referring to the analysis where you generate a pooled set of peaks between the two samples, then count the number of ATAC reads in each peak and do a count-based analysis of the difference between the two conditions.
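
The pooled-peak counting described in the reply above could look roughly like this. It is a sketch under assumptions: peaks are `(chrom, start, end)` tuples with half-open coordinates, each read is reduced to a single `(chrom, position)` point, and the names `merge_peaks` and `count_reads` are hypothetical. The resulting per-peak counts are what a count-based tool would take as input (DESeq2 and edgeR are common choices, though the thread does not name one).

```python
from collections import defaultdict


def merge_peaks(*peak_sets):
    """Pool peaks from all sets and merge overlapping or touching intervals."""
    by_chrom = defaultdict(list)
    for peaks in peak_sets:
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end))
    merged = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is None:
                cur_start, cur_end = start, end
            elif start <= cur_end:
                # Extend the current merged interval.
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged


def count_reads(peaks, read_positions):
    """Count reads whose (chrom, position) falls inside each half-open peak."""
    return [
        sum(1 for c, pos in read_positions if c == chrom and start <= pos < end)
        for chrom, start, end in peaks
    ]


# Toy peaks and read positions for two conditions (hypothetical data).
peaks_a = [("chr1", 100, 200), ("chr1", 500, 600)]
peaks_b = [("chr1", 150, 250), ("chr2", 10, 50)]
reads_a = [("chr1", 120), ("chr1", 210), ("chr1", 550)]
reads_b = [("chr1", 130), ("chr2", 20), ("chr2", 30)]

pooled = merge_peaks(peaks_a, peaks_b)
counts_a = count_reads(pooled, reads_a)
counts_b = count_reads(pooled, reads_b)

print(pooled)    # [('chr1', 100, 250), ('chr1', 500, 600), ('chr2', 10, 50)]
print(counts_a)  # [2, 1, 0]
print(counts_b)  # [1, 0, 2]
```

Counting every sample against the same pooled peak list keeps the rows of the count matrix comparable across conditions.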

Devon Ryan replied 21 months ago:

Ah, that could be.