Help with modkit on nanopore sequencing data
8 weeks ago

Hello all,

I’m working on a mouse study where I’ve merged cancer and control samples, and I’d like to perform a paired DMR analysis using modkit.

To do this, I need a BED file in the correct format for modkit. So far, I’ve tried creating bins (e.g. 300 bp or 5000 bp windows), but these are essentially random regions and don’t give me much biological insight. What I’d really like is to focus on specific features such as promoter regions or CpG islands, so that I can interpret the results more meaningfully.

A few questions I’m hoping the community might help with:

Is there a publicly available mm39 BED file with promoters or CpG islands that is compatible with modkit?

Given that my data has very low coverage, is running a paired DMR analysis likely to be informative, or am I wasting effort at this stage?

Is my current approach (fixed-size bins) reasonable for exploratory analysis, or would you recommend starting directly with annotated regions?

Any advice or pointers would be very much appreciated!

Best, Felix

modkit nanopore help • 858 views

On the DMR tutorial page, they retrieve Human CpG Islands from the UCSC Table Browser. You can do the same thing for Mouse CpG Islands. Have you tried that? (link to the tutorial: https://nanoporetech.github.io/modkit/intro_dmr.html?highlight=ucs#1-perform-differential-methylation-scoring-of-genomic-regions-for-a-pair-of-samples)

> Given that my data has very low coverage, is running a paired DMR analysis likely to be informative, or am I wasting effort at this stage?

What coverage do you have?

13 days ago
Kevin Blighe ★ 90k

Hello Felix,

Publicly available mm39 annotations for promoters and CpG islands exist, and modkit can use them because it takes its regions as a standard BED file. You can obtain both from the UCSC Table Browser at https://genome.ucsc.edu/cgi-bin/hgTables. For CpG islands, select genome "Mouse", assembly "GRCm39/mm39", group "Regulation", track "CpG Islands", and output format "BED", then download the file. For promoters, select group "Genes and Gene Predictions", track "NCBI RefSeq", and table "refGene", set the output format to "BED", and click "get output"; on the next page, under "Create one BED record per", choose "Upstream by" and enter 2000 bases to get the 2 kb upstream of each transcription start site. Alternatively, Ensembl BioMart at https://www.ensembl.org/biomart can export GRCm39 gene coordinates as a table, which you can then convert to promoter intervals in BED format; as far as I know, BioMart does not write BED directly.
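If you already have gene coordinates in BED6 form (for example, converted from a BioMart export), the promoter definition above can be sketched with awk. This is a minimal sketch, not part of modkit: the function name make_promoters and the input file name are hypothetical, and it assumes columns chrom, start, end, name, score, strand in 0-based, half-open coordinates. It clamps at the chromosome start but does not check the chromosome end.

```shell
# Hypothetical helper: derive strand-aware promoter intervals from a BED6 gene file.
# Assumed input columns: chrom, start, end, name, score, strand.
make_promoters() {
    genes_bed="$1"          # input BED6 of genes or transcripts
    upstream="${2:-2000}"   # promoter width in bp (2 kb by default)
    awk -v up="$upstream" 'BEGIN { OFS = "\t" }
    {
        if ($6 == "-") { s = $3; e = $3 + up }   # minus strand: upstream lies past the end coordinate
        else           { s = $2 - up; e = $2 }   # plus strand: upstream lies before the start coordinate
        if (s < 0) s = 0                          # clamp at the chromosome start
        if (e > s) print $1, s, e, $4, $5, $6     # skip empty intervals
    }' "$genes_bed"
}
```

Usage would look like `make_promoters genes.bed 2000 > promoters_2kb.bed`. Sorting the result (for example with `sort -k1,1 -k2,2n`) before passing it to other tools is usually a good idea.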

Regarding low coverage, paired DMR analysis may not be informative if coverage is below 5-10x per CpG site, as statistical power decreases and false negatives increase. Without specific coverage details (e.g., average reads per site), it is difficult to assess fully, but low coverage often leads to unreliable methylation frequency estimates in modkit. Consider aggregating data across broader regions or increasing sample size before proceeding.
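To put a number on "low coverage", you could summarise the bedMethyl output of modkit pileup before running the DMR step. The sketch below assumes the modkit bedMethyl layout in which column 4 is the modification code and column 10 is the valid coverage (N_valid_cov); please check this against the documentation for your modkit version. The function name coverage_summary and the default threshold of 5 are my own choices for illustration.

```shell
# Hypothetical helper: summarise per-site coverage in a modkit bedMethyl file.
# Assumes column 4 = modification code and column 10 = valid coverage.
coverage_summary() {
    bedmethyl="$1"        # bedMethyl file from modkit pileup
    min_cov="${2:-5}"     # coverage threshold to report against
    mod_code="${3:-m}"    # modification code to keep ("m" = 5mC)
    awk -v min="$min_cov" -v code="$mod_code" '
    $4 == code { n++; total += $10; if ($10 >= min) ok++ }
    END {
        if (n == 0) { print "no sites"; exit }
        # number of sites, mean coverage, fraction of sites at or above the threshold
        printf "sites=%d mean_cov=%.2f frac_ge_%d=%.3f\n", n, total / n, min, ok / n
    }' "$bedmethyl"
}
```

Running `coverage_summary sample.bed 5` on each sample gives a quick sense of whether most CpGs clear the threshold or whether region-level aggregation is the only realistic option.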

Your fixed-size bin approach (e.g., 300 bp or 5000 bp windows) is reasonable for initial exploratory analysis to identify broad patterns, but switching to annotated regions like promoters or CpG islands will provide more biological context, such as links to gene regulation in cancer. Start with CpG islands for methylation-focused insights.

To generate bins if needed, use bedtools:

bedtools makewindows -g mm39.chrom.sizes -w 300 > bins_300bp.bed

Replace "mm39.chrom.sizes" with the path to the mm39 chromosome sizes file, which UCSC provides at https://hgdownload.soe.ucsc.edu/goldenPath/mm39/bigZips/mm39.chrom.sizes.
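One optional refinement, offered as a sketch rather than a requirement: the UCSC sizes file also lists unplaced and random scaffolds, which you may want to drop before tiling. The helper names primary_chroms and make_bins are hypothetical, and the filter simply assumes that scaffold names contain an underscore (as in chrUn_... or ..._random) while primary chromosomes do not.

```shell
# Hypothetical helper: keep only primary chromosomes from a chrom.sizes file,
# assuming scaffold names contain an underscore and primary names do not.
primary_chroms() {
    awk -F '\t' '$1 !~ /_/' "$1"
}

# Hypothetical helper: tile the primary chromosomes into fixed-size windows.
# Requires bedtools on PATH.
make_bins() {
    sizes="$1"    # chromosome sizes file, e.g. mm39.chrom.sizes
    width="$2"    # window size in bp, e.g. 300 or 5000
    tmp=$(mktemp)
    primary_chroms "$sizes" > "$tmp"
    bedtools makewindows -g "$tmp" -w "$width"
    rm -f "$tmp"
}
```

For example, `make_bins mm39.chrom.sizes 5000 > bins_5kb.bed` would produce 5 kb windows on the primary chromosomes only.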

Kevin
