Question: Must the ChIP-seq control file used in peak calling cover the whole genome?
2.5 years ago by
dhehdqls0 wrote:

I am trying to call peaks on benchmark data with MACS2.

Suppose I want to use only a subsection of the genome, chr1:3,000,000-3,100,000.

My input is: < JUN_K562.bam , JUN_control.bam >.

I want to see how the results change when I modify threshold values such as the q-value cutoff, or the broad cutoff in broad peak calling.

However, if I slice JUN_control.bam to the same region (3M to 3M + 100K) as JUN_K562.bam,

the result file does not change no matter how I modify the q-value. I don't know why.

But when I tried this with the whole-genome control file, it worked.

So this is my question: when I want to call peaks on a small subsection of the genome,

must the control file cover the whole genome?

tool chip-seq sequence macs • 893 views
2.5 years ago by
geek_y10k wrote:

MACS2 calculates the genome-wide background using the formula (the_number_of_control_reads * fragment_length) / genome_size and uses this value when estimating the raw local bias, which is taken as the maximum of the small local background, the large local background and the genome-wide background.

So the number of reads in the control file definitely affects the background noise calculation.
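
To make the effect concrete, here is a minimal Python sketch of the genome-wide background formula quoted above. It is only an illustration of the arithmetic, not MACS2's actual code. The read counts and the fragment length are made-up numbers, and the genome size of 2.7e9 is assumed to be left at the value MACS2 uses for `-g hs`.

```python
def genome_wide_lambda(control_reads, fragment_length, genome_size):
    """Expected control coverage per base under a uniform background.

    Implements (number_of_control_reads * fragment_length) / genome_size.
    """
    return control_reads * fragment_length / genome_size


# All numbers below are hypothetical and chosen only for illustration.
GENOME_SIZE = 2.7e9          # effective human genome size (assumed, -g hs)
FRAGMENT_LENGTH = 200        # assumed fragment length in bp
WHOLE_CONTROL_READS = 20_000_000   # assumed reads in the full control BAM
SLICED_CONTROL_READS = 1_000       # assumed reads left after slicing to 100 kb

whole = genome_wide_lambda(WHOLE_CONTROL_READS, FRAGMENT_LENGTH, GENOME_SIZE)
sliced = genome_wide_lambda(SLICED_CONTROL_READS, FRAGMENT_LENGTH, GENOME_SIZE)

print(f"whole-genome control: lambda_BG = {whole:.6f}")
print(f"sliced control:       lambda_BG = {sliced:.8f}")
print(f"ratio whole / sliced: {whole / sliced:.0f}")
```

With the genome size unchanged, the sliced control gives a genome-wide background that is smaller by the same factor as the read count, which is one reason the two runs are not comparable.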

MACS2 also scales the ChIP and control to the same sequencing depth (after calculating the raw local bias) to estimate the local lambda. By default the larger data set is scaled down towards the smaller one, so if you have too few reads in your treatment or control, the other data set will be scaled down to match. This might also affect your results.

In the end, it's NOT A GOOD IDEA to call peaks on a subset of the data. Call peaks on all the data and then use the peaks that fall in your regions of interest.
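
One way to do the second step is sketched below in Python. It assumes the peak file is tab-separated with chromosome, start and end in the first three columns (as in BED or narrowPeak files) and keeps every peak that overlaps the region from the question. The function name `peaks_in_region` and the example peak lines are made up for illustration.

```python
def peaks_in_region(bed_lines, chrom, start, end):
    """Yield peak records (as lists of fields) overlapping [start, end) on chrom.

    Assumes tab-separated lines with chrom, start, end in columns 1-3
    and 0-based, half-open coordinates.
    """
    for line in bed_lines:
        # Skip blank lines and common header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        p_chrom, p_start, p_end = fields[0], int(fields[1]), int(fields[2])
        # Two intervals overlap when each one starts before the other ends.
        if p_chrom == chrom and p_start < end and p_end > start:
            yield fields


# Hypothetical peak lines standing in for a whole-genome peak file.
example_peaks = [
    "chr1\t2999900\t3000150\tpeak_1\n",   # overlaps the start of the region
    "chr1\t3050000\t3050400\tpeak_2\n",   # inside the region
    "chr1\t3100000\t3100300\tpeak_3\n",   # starts exactly at the region end
    "chr2\t3050000\t3050400\tpeak_4\n",   # right position, wrong chromosome
]

selected = list(peaks_in_region(example_peaks, "chr1", 3_000_000, 3_100_000))
for fields in selected:
    print("\t".join(fields))
```

For a real file, the same function can be given an open file handle instead of the example list, since it only iterates over lines.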
