Question: Active enhancers and gene expression
16 months ago, carlosalfonsogonzalez6 wrote:

Hello

I am trying to understand the impact of a specific active enhancer region on nearby genes. What I want to obtain is which genes near the enhancer region are downregulated or upregulated in the control vs. experimental condition. Do you have any advice, or know of a library with which I could integrate both RNA-seq and ChIP-seq data and identify a possible association between TF binding and the expression of nearby genes?

Thanks!

rna-seq chip-seq • 644 views
modified 16 months ago by jared.andrews07 • written 16 months ago by carlosalfonsogonzalez6

Here are some steps to follow:

  1. Assign each enhancer to a gene. This can be done in two ways: distance-based (take the nearest 2 genes upstream and downstream), or TAD-based (assign all genes in a TAD to an enhancer found in that TAD). Check this paper (methods section).
  2. Perform a correlation test between the normalized signal of your ChIP-seq and the normalized gene expression.
  3. Be a little flexible with correlation cut-offs.
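Step 2 can be sketched roughly as below. This is only a minimal illustration, assuming you already have one normalized ChIP-seq signal value and one normalized expression value per sample for a given enhancer-gene pair; the numbers and the function names (`ranks`, `pearson`, `spearman`) are made up for the example. It uses a rank-based (Spearman) correlation, which is one reasonable choice, not necessarily the test used in the paper mentioned above. Note that a correlation needs several samples; with only two conditions there is nothing to correlate.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical values: one enhancer and one candidate gene across 6 samples.
enhancer_signal = [1.2, 3.4, 2.2, 5.1, 4.8, 0.9]
gene_expression = [10.0, 25.0, 18.0, 40.0, 33.0, 12.0]
rho = spearman(enhancer_signal, gene_expression)
print(f"Spearman rho = {rho:.3f}")
```

Running this over every enhancer-gene pair from step 1 gives one correlation per pair, which you can then threshold (loosely, as per step 3).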

I did this kind of analysis recently; let me know if you need more details.

modified 16 months ago • written 16 months ago by venu

I would be very careful with this. When analysing different types of chromatin interaction data, just taking the nearest gene is only slightly better than randomly selecting genes (AUROC values in the 0.5-0.6 range); see e.g. fig. 3b-e in this recent article.
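To make the AUROC numbers concrete: an AUROC is the probability that a randomly chosen true enhancer-gene pair gets a better score than a randomly chosen false one, so 0.5 means "no better than random". Below is a small sketch, assuming the score is simply the negative enhancer-gene distance and the labels say whether interaction data support the pair. The distances and labels are invented for illustration and are not taken from the article.

```python
def auroc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, is_true in zip(scores, labels) if is_true]
    negatives = [s for s, is_true in zip(scores, labels) if not is_true]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative pair")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical enhancer-gene pairs: distance in kb, and whether the pair
# is supported by chromatin interaction data (1) or not (0).
distances_kb = [5, 20, 50, 80, 120, 300]
supported = [1, 0, 1, 0, 0, 1]

# Closer genes should score higher, so use the negative distance as score.
scores = [-d for d in distances_kb]
auc = auroc(scores, supported)
print(f"AUROC of the nearest-gene score = {auc:.3f}")
```

In this toy case the distance-based score wins 5 of the 9 positive/negative comparisons, i.e. it is only marginally better than a coin flip.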

written 16 months ago by kristoffer.vittingseerup

I have RNA-seq and ChIP-seq experiments from the same stage and the same lab, coupled in time. What do you think? I'm building a core for machine learning.

written 16 months ago by carlosalfonsogonzalez6

Thanks a lot!! Can you recommend a package or library to work on that, preferably in R?

written 16 months ago by carlosalfonsogonzalez6

To help you out, I need to know which cell type and organism you are working with.

written 16 months ago by kristoffer.vittingseerup

I'm working with Drosophila at the blastoderm stage, with ChIP-seq for four TFs and RNA-seq from the same sample and the same organism.

written 16 months ago by carlosalfonsogonzalez6
16 months ago, jared.andrews07 (St. Louis, MO) wrote:

There are many different methods/packages out there for doing things like this. A little searching will turn up many blog posts, Biostars questions, and publications. My usual workflow goes something like this:


1.) Identify differentially bound peaks.

Assuming you've already called your peaks (with MACS, HOMER, spp, etc.), this can be done with software like DiffBind (if you have replicates/many samples) or MAnorm (single samples). They'll derive a consensus peakset and compare it across your control vs. treatment conditions. They're also pretty easy to set up and use. This will yield a set of differentially bound peaks.
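To give a feel for what "deriving a consensus peakset" means, here is a rough sketch that merges overlapping peaks across samples and keeps only merged regions seen in a minimum number of samples. It is an illustration of the general idea only, not the algorithm DiffBind or MAnorm actually use; the function name `merge_peaks`, the `min_samples` parameter, and the coordinates are all hypothetical.

```python
def merge_peaks(peak_sets, min_samples=2):
    """Merge overlapping peaks across samples into consensus regions.

    peak_sets: one list per sample of (chrom, start, end) tuples.
    Returns merged (chrom, start, end) regions that contain peaks from
    at least `min_samples` different samples. Overlapping or directly
    adjacent intervals are merged.
    """
    tagged = sorted(
        (chrom, start, end, sample)
        for sample, peaks in enumerate(peak_sets)
        for chrom, start, end in peaks
    )
    consensus = []
    current = None  # [chrom, start, end, set of contributing samples]
    for chrom, start, end, sample in tagged:
        if current and chrom == current[0] and start <= current[2]:
            # Peak overlaps the region being built: extend it.
            current[2] = max(current[2], end)
            current[3].add(sample)
        else:
            # Close the previous region and start a new one.
            if current and len(current[3]) >= min_samples:
                consensus.append((current[0], current[1], current[2]))
            current = [chrom, start, end, {sample}]
    if current and len(current[3]) >= min_samples:
        consensus.append((current[0], current[1], current[2]))
    return consensus


# Hypothetical peak calls from two samples.
sample_a = [("chr2L", 100, 200), ("chr2L", 500, 600)]
sample_b = [("chr2L", 150, 260), ("chr3R", 40, 90)]

consensus_peaks = merge_peaks([sample_a, sample_b], min_samples=2)
print(consensus_peaks)
```

The real tools then count reads in each consensus region per sample and test for differences between conditions, which this sketch does not attempt.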


2.) Identify differentially expressed genes.

It seems most people have lately tried to move away from alignment-dependent RNA quantification tools (cufflinks2, etc.) towards inference-based estimation methods (salmon, kallisto), followed by a typical differential gene expression package like DESeq2, edgeR, or limma. The latter also have the advantage of being much quicker. This will yield a list of differentially expressed genes, which you can then filter/rank by magnitude/p-value.
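The filter/rank step might look something like the sketch below. It assumes the differential expression results were exported to a CSV with DESeq2-style `log2FoldChange` and `padj` columns plus a `gene` column (the `gene` column name, the cutoffs, and the example rows are assumptions for illustration, not real results).

```python
import csv
import io
import math


def significant_genes(rows, padj_cutoff=0.05, min_abs_lfc=1.0):
    """Keep genes passing both cutoffs, ranked by adjusted p-value."""
    hits = []
    for row in rows:
        try:
            padj = float(row["padj"])
            lfc = float(row["log2FoldChange"])
        except ValueError:
            continue  # skip rows with missing values such as "NA"
        if math.isnan(padj) or math.isnan(lfc):
            continue
        if padj <= padj_cutoff and abs(lfc) >= min_abs_lfc:
            hits.append({
                "gene": row["gene"],
                "log2FoldChange": lfc,
                "padj": padj,
                "direction": "up" if lfc > 0 else "down",
            })
    # Most significant first; larger effect size breaks ties.
    hits.sort(key=lambda h: (h["padj"], -abs(h["log2FoldChange"])))
    return hits


# Hypothetical results table (made-up genes and values).
table = """gene,log2FoldChange,padj
geneA,2.5,0.001
geneB,-1.8,0.0004
geneC,0.3,0.01
geneD,3.0,NA
geneE,-2.2,0.2
"""

de_hits = significant_genes(csv.DictReader(io.StringIO(table)))
for hit in de_hits:
    print(hit["gene"], hit["direction"], hit["log2FoldChange"], hit["padj"])
```

Keeping the direction (up/down) alongside each gene makes it easy to later ask whether genes near a gained or lost peak move in a consistent direction.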


3.) Identify differentially bound peaks that correlate with differentially expressed genes.

This becomes a little trickier as we don't know your experimental setup, what TF you're ChIPing, or what your control and experimental conditions are. Regardless, I usually take a simple approach first, just looking at the closest differentially expressed genes to my peaks with bedtools' closest or BEDOPS closest-features. This will give you a list with the closest gene to each peak, though it's important to remember that the target gene of a given regulatory element may be up to 1000kb away.
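As a rough picture of what the "closest gene" step does, here is a sketch that assigns each peak to the nearest gene on the same chromosome, with a distance cap. It is a simplification, not a reimplementation of bedtools or BEDOPS: it measures from the peak midpoint to a single position per gene (e.g. the TSS), and the function name `nearest_genes`, the `max_distance` parameter, and all coordinates are hypothetical. Passing in only your differentially expressed genes gives the "closest differentially expressed gene" for each peak.

```python
from bisect import bisect_left


def nearest_genes(peaks, genes, max_distance=1_000_000):
    """Find the nearest gene for each peak, within max_distance bp.

    peaks: list of (chrom, start, end).
    genes: list of (chrom, position, name), e.g. position = TSS.
    Returns a list of (peak, (gene_name, distance)) pairs, with None in
    place of the gene tuple when no gene is within range.
    """
    by_chrom = {}
    for chrom, position, name in genes:
        by_chrom.setdefault(chrom, []).append((position, name))
    for entries in by_chrom.values():
        entries.sort()

    results = []
    for chrom, start, end in peaks:
        entries = by_chrom.get(chrom, [])
        midpoint = (start + end) // 2
        # Index of the first gene at or after the peak midpoint.
        i = bisect_left(entries, (midpoint,))
        best = None
        # Only the neighbours on either side of the midpoint can be nearest.
        for position, name in entries[max(i - 1, 0):i + 1]:
            distance = abs(position - midpoint)
            if distance <= max_distance and (best is None or distance < best[1]):
                best = (name, distance)
        results.append(((chrom, start, end), best))
    return results


# Hypothetical differentially expressed genes and differentially bound peaks.
de_genes = [("chr2L", 1000, "geneA"), ("chr2L", 50000, "geneB"), ("chr3R", 7000, "geneC")]
db_peaks = [("chr2L", 1900, 2100), ("chr2L", 40000, 40200), ("chrX", 10, 20)]

assignments = nearest_genes(db_peaks, de_genes)
for peak, gene in assignments:
    print(peak, gene)
```

Given the caveat about distal targets, it can be worth reporting the distance alongside each assignment rather than treating every nearest gene as a confirmed target.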


4.) Visualize groups and pathway analyses.

Once I have these lists, I try to visualize them across all of my samples to pick out sites/genes that are robust and recurrent. I've grown fond of EaSeq for visualizing signal at peaksets quickly in a variety of ways - heatmaps, genome-wide signal profiles, and more. It's also good for looking at individual loci/genes if you're interested in specific examples.

I also usually run my peaks through GREAT, which performs pathway and GO enrichment analyses on genes near your peaks. At a minimum, it usually helps you determine if your results make sense biologically.

This is a rather generic and vague guide, but hopefully it helps you get started.

written 16 months ago by jared.andrews07

Hi, thanks a lot.

I have already done the peak calling with MACS and annotated the genome and pathways with ChIPseeker, so I have the genes. I have plotted nearby gene expression for selected enhancers manually, but I want to do it at scale. The idea is to find a correlation between enhancer divergence between two species and how this affects the expression of some genes, in order to build a model.

written 16 months ago by carlosalfonsogonzalez6

You want to do enhancer divergence between species? Which species? Build a model of what? More info will yield better answers as we'll better understand what you're trying to achieve.

written 16 months ago by jared.andrews07