Question

Comparison of two MeDIP data sets

8.2 years ago

florian.noack ▴ 20

Hi everybody,

I have (h)MeDIP data from two different cell types that I would like to compare in order to identify D(h)MRs between them. I used MACS2 to call peaks and calculated an enrichment value according to this formula:

$Embedded Image$

This is more or less the log2 ratio of sequencing reads in a peak between the antibody-treated sample and the non-antibody-treated control (normalized by the total number of reads). To compare the two cell types, I then calculated the log2 fold change of the mean enrichment values of the two cell types (all experiments were done in triplicate). Using this strategy I identified several regions whose nearest genes have GO terms that make perfect sense in the biological context of the experiment.

However, calculating a log2 fold change from numbers that are already on a log2 scale sounds a bit weird to me (though I am a biologist and have no clue about mathematics in general ;) ). I also tried simply calculating the fold change and setting new cutoffs comparable to the ones I used on the log2 scale, but then I lose a lot of potential DMRs.

Therefore, can I continue to calculate the log2 ratio of the two enrichment values, or is this simply wrong? Are there alternatives?
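To make the concern concrete, here is a minimal sketch with made-up read counts. It assumes the enrichment value has the form log2((IP reads / IP library size) / (input reads / input library size)), which is only a guess at the formula in the image; the function name `enrichment` and all numbers are hypothetical.

```python
import math

def enrichment(ip_reads, input_reads, ip_total, input_total):
    """Assumed form: log2 of depth-normalized IP reads over depth-normalized input reads."""
    return math.log2((ip_reads / ip_total) / (input_reads / input_total))

# Hypothetical counts for one peak; all libraries have 1,000,000 reads.
e_a = enrichment(400, 50, 1_000_000, 1_000_000)  # about log2(8)   =  3
e_b = enrichment(100, 50, 1_000_000, 1_000_000)  # about log2(2)   =  1
e_c = enrichment(25, 50, 1_000_000, 1_000_000)   # about log2(0.5) = -1 (depleted)

# Difference of log2 values == log2 of the ratio of the linear enrichments (8 / 2 = 4).
diff_of_logs = e_a - e_b

# "log2 fold change of log2 values": log2(3 / 1), a different and hard-to-interpret number.
log_of_log_ratio = math.log2(e_a / e_b)

# With a depleted peak the ratio of log2 values is negative, so its log2 is undefined.
try:
    math.log2(e_a / e_c)
    log_of_logs_defined = True
except ValueError:
    log_of_logs_defined = False

print(diff_of_logs, log_of_log_ratio, log_of_logs_defined)
```

Subtracting two log2 enrichment values is the same as taking the log2 of the ratio of the underlying linear enrichments, so it stays well defined for depleted peaks. The log2 of a ratio of log2 values does not have that interpretation and breaks as soon as one value is zero or the two values have opposite signs.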

Thanks a lot,
Flo

ChIP-Seq MEDIP hMEDIP • 2.4k views

ADD COMMENT • link updated 21 months ago by Ram 43k • written 8.2 years ago by florian.noack ▴ 20

Why not simply skip the log2 when you take the ratio between sample and control, i.e. work with the plain fold change?

ADD REPLY • link 8.2 years ago by Sukhi Singh 11k

Sorry, I am still not sure which method I should use. Is the following pipeline correct if I want to identify differentially methylated regions between cell type A and cell type B? (I am especially unsure about step 6.)

1. Call peaks using MACS (for each cell type separately).
2. Merge the peak sets.
3. Count the number of reads falling into each peak, for each cell type separately (AB-treated sample and Input).
4. Calculate the enrichment value using $Embedded Image$ for both cell types separately (more or less a log2 ratio of sequencing-depth-normalized reads in the sample vs. sequencing-depth-normalized reads in the Input).
5. Keep only peaks that reach a certain enrichment threshold in at least one of the two cell types (to get rid of peaks called by MACS that have only low enrichment in both cell types).
6. For each of the filtered peaks, subtract the sequencing-depth-normalized Input reads from the sequencing-depth-normalized sample reads, for each cell type separately.
7. Use these values to calculate a log2 ratio between cell type A and cell type B.
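As a rough illustration of steps 4 to 7 above, here is a sketch using NumPy on per-peak read counts for a merged peak set. Everything in it is an assumption for illustration: the function names `cpm` and `differential_signal`, the threshold `min_enrichment`, the pseudocount, the clipping of negative input-subtracted values to zero, and the example counts. It is not the exact enrichment formula from the image.

```python
import numpy as np

def cpm(counts, total_reads):
    """Scale raw per-peak counts to reads per million sequenced reads."""
    return np.asarray(counts, dtype=float) / total_reads * 1e6

def differential_signal(ip_a, input_a, ip_b, input_b, totals,
                        min_enrichment=1.0, pseudocount=1.0):
    """Return (keep_mask, log2 ratio A vs B) for each peak of a merged peak set."""
    ip_a_n, in_a_n = cpm(ip_a, totals["ip_a"]), cpm(input_a, totals["input_a"])
    ip_b_n, in_b_n = cpm(ip_b, totals["ip_b"]), cpm(input_b, totals["input_b"])

    # Step 4: enrichment over Input per cell type (pseudocount avoids log2(0)).
    enr_a = np.log2((ip_a_n + pseudocount) / (in_a_n + pseudocount))
    enr_b = np.log2((ip_b_n + pseudocount) / (in_b_n + pseudocount))

    # Step 5: keep peaks that are enriched in at least one cell type.
    keep = (enr_a >= min_enrichment) | (enr_b >= min_enrichment)

    # Step 6: Input-subtracted signal; negative values are clipped to zero (assumption).
    sig_a = np.clip(ip_a_n - in_a_n, 0.0, None)
    sig_b = np.clip(ip_b_n - in_b_n, 0.0, None)

    # Step 7: log2 ratio between cell type A and cell type B.
    log2_ratio = np.log2((sig_a + pseudocount) / (sig_b + pseudocount))
    return keep, log2_ratio

# Hypothetical counts for three peaks; every library has 1,000,000 reads.
totals = {"ip_a": 1_000_000, "input_a": 1_000_000,
          "ip_b": 1_000_000, "input_b": 1_000_000}
keep, log2_ratio = differential_signal(
    ip_a=[70, 10, 5], input_a=[7, 9, 5],
    ip_b=[14, 10, 40], input_b=[7, 9, 5],
    totals=totals,
)
print(keep, log2_ratio)
```

Note that step 6 can produce zero or negative values wherever the Input is as high as the sample, which is why the sketch clips and adds a pseudocount before taking the log2 in step 7; without something like that, the ratio is undefined for such peaks.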

Is this correct (especially step 6)?

Right now I do the same up to step 5, but instead of going back to the raw reads I subtract the enrichment value of cell type B from that of cell type A to obtain a value that expresses the difference in methylation at a genomic location (peak) between the two cell types.

Thanks for your advice,
Flo

ADD REPLY • link updated 21 months ago by Ram 43k • written 8.2 years ago by florian.noack ▴ 20

score 0 · Answer 1 · 2016-02-08

8.2 years ago

Fidel ★ 2.0k

It is not a good idea to use log2 ratios to compute other ratios. Why don't you simply compute new log2 ratios for the samples you want to compare? You can use deepTools for this; see the documentation for bamCompare.
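Conceptually, computing a fresh log2 ratio between two samples could look like the sketch below. This is not deepTools code and does not reproduce what bamCompare does internally; the function name, the choice to scale both libraries to the smaller one, the pseudocount, and the counts are all assumptions for illustration.

```python
import numpy as np

def log2_ratio_between_samples(counts_a, counts_b, total_a, total_b, pseudocount=1.0):
    """Per-region log2(A / B) after scaling both samples to the smaller library size."""
    smaller = min(total_a, total_b)
    a = np.asarray(counts_a, dtype=float) * (smaller / total_a)
    b = np.asarray(counts_b, dtype=float) * (smaller / total_b)
    # The pseudocount keeps the ratio defined for regions without reads.
    return np.log2((a + pseudocount) / (b + pseudocount))

# Hypothetical per-region counts; sample A was sequenced twice as deeply as sample B.
ratios = log2_ratio_between_samples(
    counts_a=[30, 14], counts_b=[3, 7],
    total_a=2_000_000, total_b=1_000_000,
)
print(ratios)
```

The point of the design is that the ratio is taken once, on depth-scaled counts, rather than on values that have already been log-transformed.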
