frequency plot for peaks
5 months ago
mavy ▴ 10

Hello,

I am a computer science student working with biomedical data, and I am learning and applying bioinformatics techniques.

I have ChIP-seq data generated by my lab for a histone modification, H3K27me3, with three different samples. I ran the whole pipeline and was able to call peaks using MACS2. The peak numbers are as expected: the sample expected to have the most peaks has the most, and the one expected to have fewer has fewer.

But when I do the downstream analysis with ChIPseeker, following the vignette, the frequency plot shows an almost similar number of peaks, whereas the Excel peak files in the MACS2 results contain 194577 and 48752 peaks respectively. The code that I am using for ChIPseeker is:

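library(ChIPseeker)  # txdb and files are assumed to be defined earlier, as in the vignette
# Promoter windows: 3 kb either side of each TSS
promoter <- getPromoters(TxDb=txdb, upstream=3000, downstream=3000)
tagMatrixList <- lapply(files, getTagMatrix, windows=promoter)
# Average profile of peaks around the TSS, one facet row per sample
plotAvgProf(tagMatrixList, xlim=c(-3000, 3000), conf=0.95, resample=500, facet="row")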

The MACS2 command that I am using is:

macs2 callpeak -t treatedfle.bam -c inputfile.bam --gsize 3.0e9  --bdg --broad --broad-cutoff 0.1  --nomodel --extsize 125 --name treated_ --outdir /home/labs/chip/ 

Any help or suggestions are welcome.

chip-seq chipseeker • 845 views

Hi,

How are you reading the number of peaks from the frequency plot / average profile?

Could you show the plot?


[Image: peak count frequency plot]

This is the peak count frequency plot that I am referring to. The first sample is supposed to have the fewest peaks, but its frequency is the highest. Am I interpreting it wrong?


The y-axis is peak count frequency, not the number of peaks. I tried to understand it by reading the R source of plotAvgProf. The peak count frequency is calculated for each sample separately (using an apply function), and the results are then collated into one figure using facets. The profile plot shows the distribution of peaks around the TSS (+/- 1 kb), with the y-axis ranging up to 0.0007. If any of your samples has peaks at a higher frequency at a given genomic position, the scale (i.e. the y-limit) will increase. Also, the peak numbers you mentioned are not low in either sample, even though they differ a lot. So when the frequency is calculated, you can end up with the same y-axis range for both samples.
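
To make the per-sample calculation concrete, here is a minimal base R sketch. It assumes that the frequency is obtained by dividing the per-position column sums of each tag matrix by their total; that is my reading of the plotAvgProf source and is worth checking against your installed ChIPseeker version. The helper names `make_tag_matrix` and `count_frequency` and the toy matrices are hypothetical and only for illustration; they are not ChIPseeker functions.

```r
# Toy illustration only: this is NOT ChIPseeker's code, just the assumed
# normalisation (column sums divided by their total) applied to fake data.
set.seed(1)

# Hypothetical helper: build a 0/1 matrix with one row per window and one
# column per position; each row marks one position near the window centre.
make_tag_matrix <- function(n_rows, width = 601, centre_sd = 100) {
  m <- matrix(0, nrow = n_rows, ncol = width)
  centre <- (width + 1) / 2
  pos <- round(rnorm(n_rows, mean = centre, sd = centre_sd))
  pos <- pmin(pmax(pos, 1), width)      # clamp to valid column indices
  m[cbind(seq_len(n_rows), pos)] <- 1   # set one cell per row
  m
}

# Hypothetical helper: per-position frequency for a single sample.
count_frequency <- function(tag_matrix) {
  ss <- colSums(tag_matrix)
  ss / sum(ss)
}

small <- make_tag_matrix(500)    # sample with few overlapping windows
large <- make_tag_matrix(2000)   # sample with four times as many

freq_small <- count_frequency(small)
freq_large <- count_frequency(large)

sum(freq_small)   # 1 (up to floating point)
sum(freq_large)   # 1 (up to floating point)
range(freq_small)
range(freq_large)
```

Because each sample is divided by its own total, both curves sum to 1 regardless of how many peaks went in. The height of the curve therefore reflects how concentrated the peaks are around the TSS, not how many peaks the sample has. To compare peak numbers, count the peaks in the MACS2 output files directly.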


Thank you so much, Ankit, for your detailed answer.


Hello, I have another query related to this. The data that I am analysing was analysed previously as well, and those results indicate a dip at the TSS region, whereas mine are highest at the TSS. Can you suggest what the reason could be? I would really appreciate your response.


What is your txdb?

Can you explain your experiment in detail?

