How to adjust for covariates with RRBS data?
15 months ago
Dff045 • 0

I want to adjust for covariates (such as age and BMI) in RRBS data and then identify differentially methylated regions (DMRs) between control and treated conditions.

I have 50 Bismark coverage files (50 samples: 25 control, 25 treated). The data set is large and computationally difficult to handle; each sample file contains around 30 to 40 million CpGs.

I am looking for simpler methods that would let me handle this data on a PC and identify DMRs in the treated condition while adjusting for covariates.

So far I know of packages like DSS, but I don't think it can adjust for 2-3 covariates, and it is also computationally demanding. methylKit can handle covariates according to its manual, but I think it may also be computationally challenging, since I only have an ordinary laptop with 8 GB of RAM. [I tried loading all 50 Bismark files in R and creating a BSseq object with the DSS package, but my system crashed after a while.]
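As a rough back-of-the-envelope check on why loading everything at once fails on 8 GB of RAM, the sketch below estimates the size of a dense count matrix. It assumes two integer counts per CpG per sample (methylated and unmethylated) stored as 4-byte integers, and the figure of 2 million sites remaining after filtering is purely hypothetical; the function name `matrix_size_gb` is made up for illustration.

```python
def matrix_size_gb(n_sites, n_samples, counts_per_cell=2, bytes_per_count=4):
    """Approximate size of a dense integer count matrix in GB (10^9 bytes).

    Assumes counts_per_cell integers (methylated, unmethylated) per site
    and sample, each stored in bytes_per_count bytes.
    """
    return n_sites * n_samples * counts_per_cell * bytes_per_count / 1e9


# Every CpG from the largest files, across all 50 samples.
full = matrix_size_gb(40_000_000, 50)

# Hypothetical: only 2 million sites survive a coverage filter.
filtered = matrix_size_gb(2_000_000, 50)

print(f"unfiltered: {full:.1f} GB, filtered: {filtered:.1f} GB")
```

Under these assumptions the unfiltered matrix alone is about twice the available RAM, before any package overhead, which suggests filtering sites before building the matrix rather than after.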

Hence, I am looking for less resource-intensive methods that can handle this type of data and adjust for covariates.

This is the format of the Bismark files (one per sample) that I have.

[Image: example format of Bismark files per sample]

  1. Does anyone have ideas on how to adjust for covariates using simpler methods that can run on an ordinary PC/laptop?
  2. I recently learned that edgeR can be used, but I am not sure how to generate the input CpG matrix. Can someone suggest how to create a matrix (30-40 million rows x 50 columns) from the Bismark files I have?
  3. For methods like limma, what kind of input is required?

Does it take a CpG matrix? If yes, what values should the matrix contain for DMR analysis: count values (raw methylated read counts) or methylation levels (methylation proportions)?

For limma, I cannot find any manual related to RRBS data. Is limma a good option for handling RRBS data?
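To make the matrix question concrete, here is a minimal sketch of one way such a CpG count matrix could be assembled without holding all files in memory at once. It assumes the usual tab-separated Bismark coverage layout (chromosome, start, end, methylation percentage, methylated count, unmethylated count) with no header line; the function names `read_cov` and `build_count_matrix` and the default thresholds are hypothetical, and this is not the input routine of edgeR, limma, or any other package.

```python
import gzip
from collections import defaultdict


def read_cov(path, min_cov=10):
    """Stream one coverage file, yielding ((chrom, pos), (meth, unmeth))
    only for sites with at least min_cov reads in this sample."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as fh:
        for line in fh:
            # Assumed columns: chrom, start, end, percent, meth, unmeth
            chrom, start, _end, _pct, meth, unmeth = line.rstrip("\n").split("\t")
            meth, unmeth = int(meth), int(unmeth)
            if meth + unmeth >= min_cov:
                yield (chrom, int(start)), (meth, unmeth)


def build_count_matrix(paths, min_cov=10, min_samples=None):
    """Return {site: [(meth, unmeth) or None, one entry per sample]}.

    A site is kept if it passes the coverage filter in at least
    min_samples files (default: all of them).
    """
    n = len(paths)
    min_samples = n if min_samples is None else min_samples

    # Pass 1: count in how many samples each site is well covered.
    seen = defaultdict(int)
    for path in paths:
        for site, _counts in read_cov(path, min_cov):
            seen[site] += 1
    keep = {site for site, k in seen.items() if k >= min_samples}
    del seen  # free memory before allocating the matrix

    # Pass 2: fill one column per sample for the retained sites only.
    matrix = {site: [None] * n for site in keep}
    for j, path in enumerate(paths):
        for site, counts in read_cov(path, min_cov):
            if site in matrix:
                matrix[site][j] = counts
    return matrix
```

Each file is read line by line, so only the sites that pass the filter are ever stored. If even that is too large, the same two passes could be run one chromosome at a time. The retained methylated and unmethylated counts can then be written out as a table for whichever downstream package is used.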

I am quite new to RRBS data and am looking for any feasible options. I am familiar with RNA-seq pipelines using edgeR and limma, so I wanted to know whether I can use them to handle RRBS Bismark files and whether that is computationally feasible. Could anyone advise me on how to go about this analysis? All I need is to adjust for confounding factors and identify DMRs in the treated condition, so that the results reflect the treatment and not the confounding factors. I have only these 50 Bismark files to begin with, and I am struggling computationally. Please help me out.

Thanks!
