Question: Combining ENCODE narrowPeak BED files across replicates
I want to overlap structural variants with ENCODE narrowPeak data. I have a few questions.

  1. Anisogenic replicate

Under one experiment I see two anisogenic replicates. Does this mean that the two replicates come from different individuals (and therefore have different genomes)?

  2. Combining replicates

Is this something I should consider? If there are X replicates for a specific tissue I'm interested in, should I normalize the signal?

I'm also unsure how to combine the replicates. This is what I'm thinking of doing:

  • Calculate a Z-score for every signal value (column 7, signalValue) within each individual narrowPeak file.
  • Overlap the replicates and take the common positions.
  • For each common position among the replicates, call the region DNase hypersensitive if and only if at least 75% of the replicates have a Z-score > 3 (or 2?).
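
To make the steps above concrete, here is a rough Python sketch of the idea, not a validated method. It assumes tab-delimited narrowPeak lines with signalValue in column 7, treats "common positions" as bases covered by a peak in every replicate, and uses the population standard deviation for the Z-score. The function names (parse_narrowpeak, zscore_peaks, call_common_regions) and the default cutoffs are my own placeholders.

```python
from collections import defaultdict
from statistics import mean, pstdev


def parse_narrowpeak(lines):
    """Turn narrowPeak lines into (chrom, start, end, signalValue) tuples."""
    peaks = []
    for line in lines:
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 7 (index 6) is signalValue in the narrowPeak format.
        peaks.append((fields[0], int(fields[1]), int(fields[2]), float(fields[6])))
    return peaks


def zscore_peaks(peaks):
    """Replace each signal value with its Z-score within one replicate."""
    signals = [p[3] for p in peaks]
    mu, sd = mean(signals), pstdev(signals)
    # Guard against a replicate whose signals are all identical.
    return [(c, s, e, (v - mu) / sd if sd > 0 else 0.0) for c, s, e, v in peaks]


def call_common_regions(replicates, z_cutoff=3.0, min_fraction=0.75):
    """Return (chrom, start, end) segments covered by every replicate where
    at least min_fraction of the replicates have a Z-score above z_cutoff."""
    zreps = [zscore_peaks(rep) for rep in replicates]

    # Every peak boundary is a potential breakpoint on its chromosome.
    breakpoints = defaultdict(set)
    for rep in zreps:
        for chrom, start, end, _ in rep:
            breakpoints[chrom].update((start, end))

    calls = []
    for chrom in sorted(breakpoints):
        cuts = sorted(breakpoints[chrom])
        for seg_start, seg_end in zip(cuts, cuts[1:]):
            best = []
            for rep in zreps:
                zs = [z for c, s, e, z in rep
                      if c == chrom and s <= seg_start and seg_end <= e]
                if not zs:
                    break  # not covered by this replicate, so not common
                best.append(max(zs))
            else:
                fraction = sum(z > z_cutoff for z in best) / len(best)
                if fraction >= min_fraction:
                    calls.append((chrom, seg_start, seg_end))
    return calls
```

The scan over every peak for every segment is quadratic, so this is only meant to pin down the logic; for real ENCODE files an interval tool would be the practical choice.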

Then, when I overlap these common peaks with my structural variants, I can count the number of peaks and the total bases they span for each variant.
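
The per-variant counting could look roughly like the sketch below. It assumes half-open BED-style coordinates, counts a peak if it overlaps the variant by at least one base, and counts each covered base once even when peaks overlap each other. The function name summarize_overlaps and the tuple layouts are hypothetical.

```python
def summarize_overlaps(variants, peaks):
    """variants: list of (chrom, start, end, name)
    peaks:    list of (chrom, start, end)
    Returns {name: (number_of_overlapping_peaks, bases_covered_by_peaks)}."""
    summary = {}
    for v_chrom, v_start, v_end, name in variants:
        # Clip every overlapping peak to the variant's boundaries.
        clipped = sorted(
            (max(p_start, v_start), min(p_end, v_end))
            for p_chrom, p_start, p_end in peaks
            if p_chrom == v_chrom and p_start < v_end and p_end > v_start
        )

        # Merge the clipped intervals so shared bases are counted once.
        bases = 0
        cur_start = cur_end = None
        for start, end in clipped:
            if cur_end is None or start > cur_end:
                if cur_end is not None:
                    bases += cur_end - cur_start
                cur_start, cur_end = start, end
            else:
                cur_end = max(cur_end, end)
        if cur_end is not None:
            bases += cur_end - cur_start

        summary[name] = (len(clipped), bases)
    return summary
```

Variants with no overlapping peak are kept in the result with a count of zero, which makes it easier to compare variants later.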

I am unfamiliar with the world of ENCODE and noncoding regulation. Please tear me to shreds (or direct me to papers) if you find my proposal weak.

