Should I merge my two datasets together or something else?
5.9 years ago
dally ▴ 210

I have a very fundamental question that I can't seem to find an answer to.

I have a variety of TF and histone marks for an untreated cell line and a treated cell line. I ran a tool that returned a BED file of 'potential' enhancer regions. It identified 36k possible enhancer sites in the untreated cell line and 34k in the treated cell line. If I am interested in seeing whether a histone mark or two are enriched or depleted at these enhancer sites, should I merge these two datasets to generate one large dataset? Or should I take only the common enhancer regions (intersectBed) that appear in both datasets?

Why is it 'correct' to merge them together as opposed to identifying common regions? Or vice versa? Or is there something else I should be doing that is entirely different?

I have not worked with untreated vs treated cell types before so I don't wish to proceed too far before determining this.

datasets enhancers bioinformatics • 1.5k views
5.8 years ago

I would first check how many enhancer sites are common between the two conditions:

  • If the overlap is high (>90%), I would conclude that the treatment has no significant effect on the enhancer set and would proceed by plotting the enrichment of histone marks/TFs on the intersection of the enhancer regions, or on the union if the overlap is >95%.
  • If the overlap is low or the enhancer sets are very different, I would say the treatment has an effect, and I would characterize which enhancers are common, which are "new", and which were lost. You can then run GO analysis on the neighbouring genes, etc., for each of these groups. Once the groups are clearly defined, go forward with the enrichment calculations.
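The overlap check and the common / new / lost grouping described above could be sketched roughly as follows. This is only an illustration, not the tool used in the question: the function name `split_by_overlap` and the toy coordinates are made up, a region counts as "common" if it overlaps the other set by at least 1 bp (an assumption; a stricter reciprocal-overlap rule may be more appropriate), and the overlap fraction is computed relative to the untreated set.

```python
from collections import defaultdict


def split_by_overlap(query, reference):
    """Split query regions into those overlapping any reference region and those that do not.

    Regions are (chrom, start, end) tuples in BED-style half-open coordinates.
    Hypothetical helper for illustration; the brute-force scan is slow on large sets.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))

    shared, unique = [], []
    for chrom, start, end in query:
        # Half-open intervals overlap when each one starts before the other ends.
        hit = any(s < end and start < e for s, e in by_chrom.get(chrom, ()))
        (shared if hit else unique).append((chrom, start, end))
    return shared, unique


# Toy data standing in for the two enhancer BED files (made-up coordinates).
untreated = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
treated = [("chr1", 150, 250), ("chr2", 300, 400)]

common, lost = split_by_overlap(untreated, treated)   # lost: untreated only
_, gained = split_by_overlap(treated, untreated)      # gained: treated only ("new")
overlap_fraction = len(common) / len(untreated)

print(f"common={len(common)} lost={len(lost)} gained={len(gained)}")
print(f"fraction of untreated enhancers also found in treated: {overlap_fraction:.2f}")
```

For real data with tens of thousands of regions, the same three groups can likely be obtained much faster with bedtools, for example `bedtools intersect -u` for regions with an overlap and `bedtools intersect -v` for regions without one.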