how to do Manhatton plot for significant DNA methylation sites associations
2
0
Entering edit mode
22 months ago
Bioinfonext ▴ 480

Hi,

I do have a limma output file for significant association, could you please suggest how to plot Manhatton plot for these sites;

sites <- read.table("cpg.site.txt", header=T, sep=",")

head(sites)

                  logFC           AveExpr       t                  P.Value              adj.P.Val               B           UCSC_RefGene_Name   UCSC_RefGene_Group  UCSC_RefGene_Name
cg17944885  -0.45182214 2.558778887 -15.5883061 1.29449893946335e-42    9.91956414325614e-37    84.49688759 0   0   chr19
cg03546163  0.255316075 -0.802081399    12.08720314 1.21449935724625e-28    4.65326927233398e-23    53.61163067 FKBP5;FKBP5;FKBP5;FKBP5 5'UTR;5'UTR;5'UTR;5'UTR chr6
cg05165263  -0.185316318    0.081109579 -10.79707879    7.43220006052377e-24    1.89839695185951e-18    43.01154759 IRF2    Body    chr4

Many thanks,

R array EPIC statistics Epigenetics • 2.0k views
ADD COMMENT
2
Entering edit mode
7 days ago
Đorđe ▴ 20

Hello! This probably comes a lot later than you would like so. However, I would like to showcase my implementation. First of all, you will need extract the EPIC (or whichever array type you used) manifest for the mapping info.

manifest <- sesameDataGet("EPICv2.address")

From there you will create an annotation DF:

anno <- data.frame( CpG = manifest$ordering$Probe_ID, chr = manifest$hg38@seqnames@values, pos = manifest$hg38@ranges@start )

Inner join the two: anno_stat <- inner_join(dmp, anno, by ="CpG")

Since some of the variables are string type, we need to change them to int:

anno_stat$chr <- gsub("chr", "", anno_stat$chr)

anno_stat$chr <- recode(anno_stat$chr, X = "23", Y = "24", MT = "25")

anno_stat$chr <- as.numeric(anno_stat$chr)

Remove non-standard and NAs:

anno_stat <- anno_stat %>% filter(grepl("^chr[0-9XY]+$", paste0("chr", chr))) %>% filter(!is.na(chr), !is.na(pos), !is.na(P.Value))

And then we can finally plot:

library(qqman) manhattan(anno_stat, chr = "chr", bp = "pos", p = "P.Value", snp = "CpG")

Hope this helps, I tried sourcing the web, but I haven't stumbled on any CpG implementations yet.

Best, Dorde Racic

ADD COMMENT
1
Entering edit mode
22 months ago
dthorbur ★ 3.2k

You can't really plot a manhattan plot with just 3 loci. They typically describe a distribution of values across an entire contig/chromosome/genome. Otherwise, a manhattan plot is just a scatter plot organised by genomic distance on the x and a given value on the y.

Here is a tutorial on how to do them in R with and without ggplot2. Halfway down the page are a few examples specifically of manhattan style plots.

EDIT: I realise there is the command head(sites), but it only printed 3 rows, when the default is 6 in R, hence my concern about the number of data points.

ADD COMMENT
0
Entering edit mode

thanks, I just put 3 loci here, but I do have some more in the output list.

ADD REPLY

Login before adding your answer.

Traffic: 3180 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6