Question: How to identify a certain pattern for histone marks
0
4.9 years ago by
dally190
United States
dally190 wrote:

So let's hope that you guys sort of understand what I'm attempting to do.

I am looking at three histone marks. H3K4me1 is one of these marks and it is used in many studies to identify probable enhancer regions.

I've noticed that when looking in IGV, I see some really nice enhancer regions that seem to have a recurring pattern, but I can't seem to find a way to identify these 'regions'.

I am trying to find H3K4me1 regions that have flanking peaks and a middle section of low signal. Similar to this: /\__/\

My question is: Is there a way to identify these regions using an already developed tool? Maybe something like bedtools? Or bedops? Essentially, I am looking for regions of H3K4me1 where Pol can bind in between H3K4me1 peaks.

This is a good example of what I'm talking about (the region distal from the TSS):

http://i.imgur.com/8NdOxUU.png?1

 

Any ideas? Or would something like this require some advanced scripting/coding and thus be unfeasible for a wet lab research assistant?

igv tools chip-seq bedops bedtools • 1.5k views
ADD COMMENTlink modified 4.9 years ago by Sachin Pundhir100 • written 4.9 years ago by dally190
6
4.9 years ago by
Denmark
Sachin Pundhir100 wrote:

We have developed a method for this purpose: PARE ( http://spundhir.github.io/PARE/ ). In our experience, it should work well for what you describe.

UPDATE: PARE is now published. It can be accessed here: http://nar.oxfordjournals.org/content/early/2016/04/19/nar.gkw250.abstract

ADD COMMENTlink modified 4.6 years ago • written 4.9 years ago by Sachin Pundhir100

Hi there sachbinfo, this sounds extremely promising! Unfortunately, your link doesn't seem to be working. Could you update it and then let me know? I'm super interested in checking this out.

EDIT: I believe I found your github. Can you confirm? https://github.com/spundhir/PARE

EDIT2: Is having two replicates a requirement to use the program? I currently only have access to one H3K4me1 ChIP-Seq experiment dataset.

EDIT3: Could this PVP detection be applied to other histone marks such as H3K27Ac using their BAM files, or would this not generate reliable data? (I'm not looking to predict enhancers using solely the H3K27Ac file; I'm more interested in a BED file of these Nucleosome Free Regions.)

ADD REPLYlink modified 11 months ago by RamRS30k • written 4.9 years ago by dally190

EDIT 1: Yes, link is correct.

EDIT 2: Yes, it requires two replicates. However, you can work around this by passing the same BAM file as both replicates. As you can imagine, this would compromise the robustness of the results.

EDIT 3: Yes, definitely, we have used it on H3K27ac as well. If you have both me1 and 27ac, one alternative could be to give the two as the two replicates; the results would then have NFRs defined by both of these histone marks.

Hope this helps.

ADD REPLYlink modified 11 months ago by RamRS30k • written 4.9 years ago by Sachin Pundhir100

Hi Sachin,

I was looking for something similar to this and am glad I came across your program. However, I seem to be running into an issue when trying to replicate your test dataset. Here is the log:

Check, if all required parameters and files are provided (Tue Dec 15 11:05:47 CST 2015).. done
Create directory structure (Tue Dec 15 11:05:47 CST 2015).. done
Populating files based on input genome, hg19 (Tue Dec 15 11:05:47 CST 2015).. done
Determine number of bases by which to extend the 3' end of reads (Tue Dec 15 11:05:47 CST 2015).. done
/Users/Carlos/libs/PARESuite//bin/nfrAnaAll: line 206: syntax error near unexpected token `>'
/Users/Carlos/libs/PARESuite//bin/nfrAnaAll: line 206: `        blockbuster_threshold_nfr -i $REP1 -j $REP2 -k $PEAKREGION -l $TFSUMMIT -o $OUTDIR/optimizeThreshold -n $OUTDIR -g $GENOME -p $OPTION -c $EXTEND_REP1 -d $EXTEND_REP2 -z $BLOCK_BKG_PERCENTILE &>>$OUTDIR/logs/blockbuster_threshold_nfr.log'
cat: results/analysis/h3k4me1_helas3.All.nfr.sig: No such file or directory
cp: results/analysis/h3k4me1_helas3.All.nfr.sig.ucsc: No such file or directory

Here is the code I used to run the program:

pare -i /Users/Carlos/Desktop/Temp.PARE.Test/h3k4me1_helas3_Rep1.bam -j /Users/Carlos/Desktop/Temp.PARE.Test/h3k4me1_helas3_Rep2.bam -o results -m hg19 -p &> pare.log

Everything seems to run smoothly, but I am not getting any results output. The output directory, the macs2 directory and so on are being created, and there even seems to be a narrowPeak file generated for the reps, but something is going wrong.

Any ideas?

Differing number of BED fields encountered at line: 665423.  Exiting...

cat: results/analysis/H3K1.mapped.rgid.sorted.filtered.dups_remove.bam.All.nfr: No such file or directory
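
A message like "Differing number of BED fields encountered" usually means that the rows of a BED file do not all have the same number of columns. Here is a minimal sketch for locating such rows, assuming a tab-delimited file. The function name `inconsistent_bed_lines` is made up for illustration and is not part of PARE or bedtools.

```python
def inconsistent_bed_lines(lines):
    """Return (line_number, n_fields) for every data row whose number of
    tab-separated fields differs from that of the first data row."""
    expected = None
    bad = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        # Skip blank lines and common header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        n_fields = len(line.split("\t"))
        if expected is None:
            expected = n_fields  # the first data row sets the expectation
        elif n_fields != expected:
            bad.append((lineno, n_fields))
    return bad


# Small in-memory example; for a real file, pass an open file handle instead.
example = [
    "chr1\t10\t20\n",
    "chr1\t30\t40\tpeak_2\n",  # one extra column
    "chr2\t5\t9\n",
]
print(inconsistent_bed_lines(example))
```

For a file on disk, something like `with open(path) as fh: inconsistent_bed_lines(fh)` would report the offending line numbers, which can then be compared with the line number in the error message.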
ADD REPLYlink modified 11 months ago by RamRS30k • written 4.9 years ago by Bioradical60

Hi Carlos,

Thanks for reporting the bug. It seems that the &>> syntax for redirecting output is not supported in some shell versions. I have now made a new version of PARE (v0.05), replacing &>> with 2>&1 (a more portable way to redirect). PARE v0.05 is available for download from http://spundhir.github.io/PARE/

ADD REPLYlink modified 11 months ago by RamRS30k • written 4.9 years ago by Sachin Pundhir100

The link fails because the closing bracket is being included in the URL.

ADD REPLYlink written 4.9 years ago by Daniel3.8k
0
4.9 years ago by
jotan1.2k
Australia
jotan1.2k wrote:

Can you call peaks and then cluster them within x base pairs with bedtools?

ADD COMMENTlink written 4.9 years ago by jotan1.2k

This is something that could solve half the problem, I suppose. Just off the top of my head (and I could be wrong, since I'm not very familiar with this sort of thing), the problem would be that you'd lose some true positives if your -d parameter is too small, or pick up false positives if it's too large. I believe the cluster option also works similarly to bedtools merge, so it would also report peaks that overlap and show a /\ formation, not just regions that have the /\____/\ formation.

I know that this sort of thing is done with DNase Hypersensitive Site data, right? There, open chromatin is determined by the 'low' region of DHS in between two large DHS peaks. I'm wondering if there is a way to apply DHS site recognition to histone marks such as H3K4me1. Again, I have no idea if this is biologically sound, or even computationally possible / relevant.
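
One way to sidestep the overlap concern above is to filter neighbouring peaks by the size of the gap between them, instead of clustering them. The sketch below is only a rough illustration of that idea: it is not PARE's method and not a bedtools feature. It assumes peaks have already been called and are given as (chrom, start, end) tuples in BED-style half-open coordinates. The name `find_valleys` and the `min_gap` / `max_gap` defaults are hypothetical placeholders, not recommended values.

```python
from collections import defaultdict


def find_valleys(peaks, min_gap=100, max_gap=1000):
    """Return (chrom, valley_start, valley_end) for neighbouring peaks that are
    separated by a gap of min_gap..max_gap bp (inclusive).

    Overlapping or abutting peaks (the plain /\\ case) give a gap <= 0 and are
    skipped as long as min_gap >= 1.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))

    valleys = []
    for chrom in sorted(by_chrom):
        prev_end = None  # furthest end seen so far on this chromosome
        for start, end in sorted(by_chrom[chrom]):
            if prev_end is not None:
                gap = start - prev_end  # half-open coords: uncovered bases
                if min_gap <= gap <= max_gap:
                    valleys.append((chrom, prev_end, start))
            prev_end = end if prev_end is None else max(prev_end, end)
    return valleys


# Made-up coordinates, purely for illustration.
peaks = [
    ("chr1", 1000, 1500),
    ("chr1", 1900, 2400),    # 400 bp after the previous peak: candidate valley
    ("chr1", 2450, 2900),    # only 50 bp after the previous peak: too close
    ("chr1", 10000, 10500),  # 7100 bp away: too far
    ("chr2", 100, 600),
    ("chr2", 500, 900),      # overlaps the previous peak: skipped
]
for chrom, start, end in find_valleys(peaks):
    print(chrom, start, end, sep="\t")
```

This only looks at the distance between called peaks. It does not check how low the signal actually is in the valley or how the flanking peaks are shaped, so any hits would still need to be inspected in IGV.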

ADD REPLYlink modified 11 months ago by RamRS30k • written 4.9 years ago by dally190