How to draw histogram plot of methylation values for all CpGs, CHGs and CHHs contexts
0
0
Entering edit mode
3.2 years ago
anisa • 0

Hi, I’m new in conducting analysis on WGBS data. I ‘m using bismark_methylation_extractor for generating the cytosine_report. To generate methylation stats for CpG, CHG and CHH, I’m using methylkit. However, when I tried to draw histogram for methylation values for all CpGs, CHGs and CHHs contexts in my plant samples, all contexts produced the same histogram plot. Result in cytosine report looks normal (meaning we can see CpG, CHG and CHH). Does anyone face the same problem as mine? What should I do to resolve this problem?

Total number of methylation call strings processed: 394013294
Final Cytosine Methylation Report
=================================
Total number of C's analysed:   9031394272

Total methylated C's in CpG context:    678818655
Total methylated C's in CHG context:    711775038
Total methylated C's in CHH context:    271189564

Total C to T conversions in CpG context:        216481786
Total C to T conversions in CHG context:        557533973
Total C to T conversions in CHH context:        6595595256

C methylated in CpG context:    75.8%
C methylated in CHG context:    56.1%
C methylated in CHH context:    3.9%
methylkit wgbs histogram plot • 1.8k views
ADD COMMENT
0
Entering edit mode

Please show the code you ran, you may have made an error

ADD REPLY
0
Entering edit mode

library(methylKit); packageVersion("methylKit") library("ggplot2"); packageVersion("ggplot2")

define objects

file.list <- list("./DH2_cytosineReport.txt", "./DK3_cytosineReport.txt") sample.ids = list("DH2", "DK3") treatment = c(1,0)

read input files

myobjCpG = methRead(file.list, sample.id = sample.ids, assembly = "oilpalm", treatment = treatment, context = "CpG", pipeline = "bismarkCytosineReport") myobjCpG

generate coverage plots

png("DH2_CpG_coverage.png", width=600, height=600) getCoverageStats(myobjCpG[[1]], plot = TRUE, both.strands = FALSE) dev.off()

generate methylation plots

png("DH2_CpG_methylationStats.png", width=600, height=600) getMethylationStats(myobjCpG[[1]], plot = TRUE, both.strands = FALSE) dev.off()

library(methylKit); packageVersion("methylKit") library("ggplot2"); packageVersion("ggplot2")

define objects

file.list <- list("./DH2_cytosineReport.txt", "./DK3_cytosineReport.txt") sample.ids = list("DH2", "DK3") treatment = c(1,0)

read input files

myobjCHG = methRead(file.list, sample.id = sample.ids, assembly = "oilpalm", treatment = treatment, context = "CHG", pipeline = "bismarkCytosineReport") myobjCHG

generate coverage plots

png("DH2_CHG_coverage.png", width=600, height=600) getCoverageStats(myobjCHG[[1]], plot = TRUE, both.strands = FALSE) dev.off()

generate methylation plots

png("DH2_CHG_methylationStats.png", width=600, height=600) getMethylationStats(myobjCHG[[1]], plot = TRUE, both.strands = FALSE) dev.off()

ADD REPLY
0
Entering edit mode

The context argument is not a filtering argument so you need to prefilter your files beforehand. Your code is assigning successively CpG CHH CHG to the exactly the same cytosines. Depending on your column you might separate cytosine reports in 3 documents for CpG CHG CHH depending on context annotation:

awk '($yourcolumnnumber == "CHG")' ./DH2_cytosineReport.txt > ./DH2_cytosineReport_CHG.txt

Then you can use readMeth separately for all contexts.

ADD REPLY
0
Entering edit mode

Thank you very much Basti..really appreciate your help. I tried and it works. Thank you

ADD REPLY
0
Entering edit mode

I have another question. I tried 2 command line for generating cytosine report. A) bismark_methylation_extractor -p --no_overlap --ignore 4 --ignore_r2 4 --ignore_3prime 3 --ignore_3prime_r2 3 --bedgraph --buffer_size 20G --cytosine_report --CX --multicore 2 --genome_folder DH3_trim_map_pe.deduplicated.bam , B) bismark_methylation_extractor -p --no_overlap --ignore 4 --ignore_r2 4 --ignore_3prime 3 --ignore_3prime_r2 3 --bedgraph --buffer_size 20G –multicore 2 --genome_folder DH3_trim_map_pe.deduplicated.bam

coverage2cytosine --CX --gc --genome_folder -o DH3_cytosineReport.txt DH3_trim_map_pe.deduplicated.bismark.cov.gz

Both command line generated the same cytosine methylation report. But when I draw histogram for methylation values, different histogram figures for CpG, CHG and CHH context were produced even though the same input file was used in methylation calling. Which result should I used for downstream analysis?. Thanks. Upper figure from A command line and below B enter image description here enter image description here

ADD REPLY

Login before adding your answer.

Traffic: 2404 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6