Question: macs2 "Effective genome size" for repetitive genomes
Menachem Sklarz (European Union) wrote 2.1 years ago:

Hi everyone

I'm working on a ChIP-seq experiment in wheat, which has a very large and repetitive genome.

I'm a bit baffled by the "effective genome size" parameter in macs2. I understand it is related to the repetitiveness of the genome, but I'm not sure how to calculate it. I've tried GEM, but it gave me an error. While I try to solve the GEM problem, does anyone have an alternative?

Secondly, if I'm looking for peaks in repetitive as well as non-repetitive regions of the genome, I thought maybe I should use the full genome length rather than the mappable length. Am I correct?

Finally - if I have a control sample (no antibody), can that be used to estimate the mappability of the genome?

Thanks!

geek_y (Barcelona/London) wrote 2.1 years ago:

Effective genome size is the size of the genome that remains after excluding the repetitive elements, i.e. the portion that reads can be mapped to uniquely. So you need to determine the uniquely mappable regions.

Although not directly related, a few options are listed here: https://github.com/fidelram/deepTools/wiki/General-deepTools-FAQs#effGenomeSize

Of these, the GEM mappability calculator would be the most useful for you.

You may not be able to use the control sample for calculating the mappability as it would not cover the entire genome.
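
As a rough illustration of what "uniquely mappable" means, here is a minimal Python sketch. It is not how macs2 or GEM compute anything: it simply counts the start positions in a sequence whose k-mer (a stand-in for a read of length k) occurs exactly once. The function name and the toy sequence are made up for the example, it ignores the reverse strand and mismatches, and it holds every k-mer in memory, so it only works on small test sequences, not on something the size of wheat.

```python
from collections import Counter


def effective_genome_size(sequence: str, k: int) -> int:
    """Count start positions whose k-mer occurs exactly once in `sequence`.

    Toy approximation: forward strand only, exact matches only,
    and k-mers containing N are never counted as mappable.
    """
    seq = sequence.upper()
    # Every k-mer in the sequence, in positional order.
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    # How often each N-free k-mer occurs.
    counts = Counter(km for km in kmers if "N" not in km)
    # A position is "uniquely mappable" if its k-mer was seen exactly once.
    return sum(1 for km in kmers if counts.get(km) == 1)


if __name__ == "__main__":
    toy = "ACGTACGTTTGCA"  # hypothetical sequence; ACGT occurs twice
    print(effective_genome_size(toy, k=4))
```

The resulting number depends on k, so the read length you assume matters: longer reads make more of a repetitive genome uniquely mappable.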


Thanks for the input and for the links. 

Do you think the "effective genome size" should be calculated in a way that matches how I'm doing the mapping? For example, if I'm retaining only uniquely mapped reads, then I should calculate the mappability from uniquely mappable regions. But if I'm also retaining reads that mapped to two or three locations, then maybe I should define the mappable regions as those that occur up to two or three times in the genome?
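
To make the idea concrete, it can be sketched in Python as follows. This is only an illustration under my own assumptions, not something macs2 or GEM does: the function name and the `max_copies` parameter are invented for the example, a k-mer stands in for a read of length k, and only exact forward-strand matches are considered.

```python
from collections import Counter


def mappable_positions(sequence: str, k: int, max_copies: int = 1) -> int:
    """Count start positions whose k-mer occurs at most `max_copies` times.

    max_copies=1 corresponds to keeping only uniquely mapped reads;
    max_copies=3 corresponds to also keeping reads with up to three hits.
    """
    seq = sequence.upper()
    # Every k-mer in the sequence, in positional order.
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    # Occurrences of each k-mer; k-mers containing N are excluded.
    counts = Counter(km for km in kmers if "N" not in km)
    return sum(1 for km in kmers if 0 < counts.get(km, 0) <= max_copies)


if __name__ == "__main__":
    toy = "ACGTACGTTTGCA"  # hypothetical sequence; ACGT occurs twice
    for n in (1, 2):
        print(n, mappable_positions(toy, k=4, max_copies=n))
```

With a looser mapping policy the count can only stay the same or grow, which is the effect I am asking about.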

Is there a site you know of that explains the statistics behind the effective genome size?

Thanks

Reply written 2.1 years ago by Menachem Sklarz

There are no complicated statistics behind "effective genome size". The thread "Why Does Macs Use A Genome Size Of 2.7 Billion Instead Of 3 Billion For Human?" might be useful, and maybe this paper: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0030377

Reply written 2.1 years ago by geek_y

Thanks a lot for the links! 

Reply written 2.1 years ago by Menachem Sklarz