Question: Must the ChIP-seq control file used for peak calling cover the whole genome?
20 months ago, dhehdqls wrote:

I am trying to call peaks on benchmark data with MACS2.

Suppose I want to use only a subsection of the genome, chr1:3,000,000-3,100,000.

My input files are JUN_K562.bam (treatment) and JUN_control.bam (control).

I want to see how the results change when I modify the threshold values, such as the q-value cutoff or the broad-cutoff in broad peak calling.

However, if I slice JUN_control.bam to the same region (3,000,000 to 3,100,000) as JUN_K562.bam, the result file does not change no matter how I modify the q-value. I don't know why.

When I tried this with the whole-genome control file, it worked.

So this is my question: when I want to call peaks on a small subsection of the genome, must the control file still cover the whole genome?

tool chip-seq sequence macs • 687 views
20 months ago, geek_y wrote:

MACS2 calculates the genome-wide background using the formula (number_of_control_reads * fragment_length) / genome_size and uses this value, together with the small local background and the large local background, to estimate the raw local bias.

So the number of reads in the control file definitely affects the background noise calculation.
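To make the formula concrete, here is a minimal Python sketch that evaluates it for a whole-genome control and for a control sliced down to a 100 kb window. The read counts and the 200 bp fragment length are made-up illustration values, and the genome size of 2.7e9 is assumed to match the human preset; none of these numbers come from the question's data.

```python
def genome_wide_lambda(control_reads, fragment_length, genome_size):
    """Genome-wide background, per the formula quoted in the answer."""
    return control_reads * fragment_length / genome_size


# All values below are hypothetical, chosen only for illustration.
GENOME_SIZE = 2.7e9      # assumed effective human genome size
FRAGMENT_LENGTH = 200    # assumed fragment length in bp

whole_genome_control_reads = 20_000_000  # made-up full control depth
sliced_control_reads = 700               # made-up depth after slicing to 100 kb

lambda_full = genome_wide_lambda(whole_genome_control_reads, FRAGMENT_LENGTH, GENOME_SIZE)
lambda_sliced = genome_wide_lambda(sliced_control_reads, FRAGMENT_LENGTH, GENOME_SIZE)

print(f"whole-genome control: {lambda_full:.4f}")
print(f"sliced control:       {lambda_sliced:.2e}")
```

With the genome size left unchanged, slicing the control shrinks the genome-wide background by several orders of magnitude in this toy case, so the background model no longer resembles the one built from the full control.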

MACS2 also scales the ChIP and control to the same sequencing depth (after calculating the raw local bias) to estimate the local lambda. If either your treatment or your control has too few reads, the larger dataset will be scaled down to match it. This might also affect your results.
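The depth scaling described above can be sketched as follows. This is a simplified illustration, assuming the larger dataset is scaled linearly toward the smaller one; the function name and the read counts are hypothetical, and it is not MACS2's actual code.

```python
def scaling_factors(treatment_reads, control_reads):
    """Return (treatment_scale, control_scale) so both match the smaller depth.

    Simplified assumption: the larger dataset is scaled down linearly.
    """
    target = min(treatment_reads, control_reads)
    return target / treatment_reads, target / control_reads


# Hypothetical case: treatment sliced to a small region, control left whole.
t_scale, c_scale = scaling_factors(treatment_reads=1_000, control_reads=20_000_000)
print(t_scale, c_scale)
```

When one file has been sliced and the other has not, the scale factor becomes extreme, which is one way a subset can distort the estimated background.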

In the end, it's NOT A GOOD IDEA to call peaks on a subset of the data. Call peaks on all the data and then keep the peaks that fall in your regions of interest.
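The "call on everything, then filter" step could look like the sketch below. It assumes the peaks are in a tab-separated, BED-like layout (chromosome, start, end, name in the first four columns), as in a narrowPeak file; the function name and the example records are invented for illustration.

```python
def peaks_in_region(lines, chrom, start, end):
    """Yield peak records (as field lists) overlapping chrom:start-end.

    Coordinates are treated as 0-based, half-open, as in BED.
    """
    for line in lines:
        # Skip blank lines and common header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        p_chrom, p_start, p_end = fields[0], int(fields[1]), int(fields[2])
        if p_chrom == chrom and p_start < end and p_end > start:
            yield fields


# Made-up peak records standing in for a whole-genome peak file.
example_peaks = [
    "chr1\t3000500\t3000900\tpeak_1\t120\t.\t8.1\t15.2\t12.0\t200",
    "chr1\t5000000\t5000400\tpeak_2\t90\t.\t6.0\t11.0\t9.0\t150",
    "chr2\t3000500\t3000900\tpeak_3\t80\t.\t5.5\t10.0\t8.0\t180",
]

kept = list(peaks_in_region(example_peaks, "chr1", 3_000_000, 3_100_000))
print([fields[3] for fields in kept])
```

In practice the same function can be fed an open file handle for the whole-genome peak file, so the threshold comparison is done on peaks called against the full background.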
