Question: Request for sites that are neither promoters nor enhancers, as negative data for a classifier
na.cna30 wrote (3.9 years ago):

Hello everyone,

I am looking for negative data for my classifier. I am trying to find a specific class of enhancers (STAT1) in the human genome. I want human regions that are neither regulatory regions nor regions associated with histone modifications.

I would appreciate it if someone could suggest such negative regions for hg18.

Thanks.

modified 3.9 years ago by aditi.qamra • written 3.9 years ago by na.cna30
aditi.qamra (Toronto) wrote (3.9 years ago):

Are you trying to find enhancers in a specific cell type? You can get enhancer data for other cell types from the FANTOM database (say, blood cells for comparison with liver tissue) to increase the specificity of the classifier in your cell type.

Alternatively, a rough set of negative regions could be obtained by taking all enhancer regions from ENCODE across all cell types (if you don't want any regulatory regions, extend this to regions carrying the marks H3K4me3, H3K4me1 and H3K27ac) and getting a list of regions that don't overlap any of these.
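
A minimal Python sketch of the pooling step described above, assuming the peak lists have already been downloaded and parsed into (chrom, start, end) tuples in BED-style coordinates. The track names and coordinates are hypothetical toy values, not real ENCODE data, and `merge_intervals` is an illustrative helper rather than part of any library.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching (chrom, start, end) intervals.

    Coordinates are assumed to be BED-style: 0-based start, exclusive end.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        current_start, current_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if current_end is None or start > current_end:
                # Close the previous block (if any) and open a new one.
                if current_end is not None:
                    merged.append((chrom, current_start, current_end))
                current_start, current_end = start, end
            else:
                # Overlaps or touches the open block, so extend it.
                current_end = max(current_end, end)
        merged.append((chrom, current_start, current_end))
    return merged


# Hypothetical toy peak lists standing in for tracks pooled across cell types.
enhancers = [("chr1", 100, 200), ("chr1", 150, 300)]
h3k4me3 = [("chr1", 500, 600), ("chr2", 10, 50)]
h3k27ac = [("chr1", 290, 320)]

regulatory = merge_intervals(enhancers + h3k4me3 + h3k27ac)
print(regulatory)
```

The merged list is the "anything regulatory" set; negative regions would then be whatever part of the genome this set does not cover.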

I would be more comfortable with a tissue-specific approach, because treating the presence or absence of a histone mark as defining "a regulatory region" is too broad and depends on the protocol, tissue, thresholds, etc.

modified 3.9 years ago • written 3.9 years ago by aditi.qamra

Thanks for replying. I have the coordinates of specific enhancers (STAT1) for HeLa cells. How can I get a list of regions that are not regulatory regions? Could you explain more about the tissue-specific approach?

I am trying to identify STAT1 regions based on histone marks, but my classifier does not predict well after training (my negative data are random sequences). Thanks.

written 3.9 years ago by na.cna30

If I understand you correctly, you are trying to identify all STAT1 enhancer regions on the basis of histone marks. I'm not sure how that is going to work. Nonetheless, to answer your question:

"How can I get a list of regions that are not regulatory regions?" You can use complementBed to get a list of all regions that don't overlap the regulatory regions you source from ENCODE, FANTOM, in-house data, etc. (https://bedtools.readthedocs.org/en/latest/content/tools/complement.html)
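
To make the complement operation concrete, here is a small pure-Python sketch of the same idea. It is not bedtools itself, only an illustration of the computation: given regulatory regions and chromosome lengths, return the stretches that no region covers. The chromosome sizes and coordinates below are hypothetical toy values; real hg18 lengths would have to come from a chromosome-sizes file.

```python
def complement(regions, chrom_sizes):
    """Return the gaps not covered by `regions` on each chromosome.

    `regions` is a list of (chrom, start, end) in BED-style coordinates
    (0-based start, exclusive end); `chrom_sizes` maps chrom -> length.
    """
    gaps = []
    for chrom, size in chrom_sizes.items():
        covered = sorted((s, e) for c, s, e in regions if c == chrom)
        cursor = 0  # first position not yet known to be covered
        for start, end in covered:
            if start > cursor:
                gaps.append((chrom, cursor, start))
            cursor = max(cursor, end)
        if cursor < size:
            gaps.append((chrom, cursor, size))
    return gaps


# Hypothetical toy chromosome sizes and regulatory regions.
chrom_sizes = {"chr1": 1000, "chr2": 500}
regulatory = [("chr1", 100, 320), ("chr1", 500, 600), ("chr2", 0, 50)]

non_regulatory = complement(regulatory, chrom_sizes)
for chrom, start, end in non_regulatory:
    print(chrom, start, end, sep="\t")
```

The printed rows are tab-separated chrom, start and end, which is the layout of a three-column BED file.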

I don't think I understood your objective for the classifier, and since you are working in HeLa cells, what I was saying about comparing with enhancer regions from other tissues doesn't really hold. But the idea was that if you are building a classifier for enhancer regions in, say, liver tissue, you might want to use histone signals from the enhancer regions of an entirely different cell type, say blood cells, as a negative control, since we know enhancers are related to cell identity.
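
One possible reading of this tissue-specific idea, sketched in Python: keep enhancers from the other cell type only if they do not overlap any enhancer in the target cell type, and use those as negatives. The function names and the liver/blood coordinates are hypothetical and chosen only for illustration.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) BED-style intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def cell_type_specific_negatives(target_enhancers, other_enhancers):
    """Keep enhancers from the other cell type that miss every target enhancer."""
    return [
        region
        for region in other_enhancers
        if not any(overlaps(region, pos) for pos in target_enhancers)
    ]


# Hypothetical toy coordinates for two cell types.
liver = [("chr1", 100, 300), ("chr2", 400, 700)]
blood = [("chr1", 250, 350), ("chr1", 900, 1200), ("chr2", 50, 150)]

negatives = cell_type_specific_negatives(liver, blood)
print(negatives)
```

The pairwise check is quadratic, which is fine for a toy example; genome-scale peak lists would call for sorted sweeps or an interval tool instead.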

 

modified 3.9 years ago • written 3.9 years ago by aditi.qamra

Thanks so much, that is very helpful.

Yes, I am trying to identify STAT1 regions on the basis of histone marks. I am training the classifier on the sequence content of regions carrying histone marks. My cell line is HeLa.

Is there an online tool like complementBed that can provide a list of regions that do not overlap regulatory regions? That tool works on Linux and OS X machines, but I am a Windows user. I just want the non-overlapping regions in hg18.

Thanks again.

modified 3.9 years ago • written 3.9 years ago by na.cna30

If you don't have access to a Unix machine, you can try using Galaxy (https://usegalaxy.org/).

written 3.9 years ago by aditi.qamra

Could you tell me how I can generate the regions that do not overlap regulatory regions in Galaxy?

Thanks for your help.

written 3.9 years ago by na.cna30

As I mentioned, you can use complementBed in Galaxy. What part is not clear? If you had opened the link I provided in my answer and browsed through the options on the left-hand side, you would have seen "Operate on Genomic Intervals", under which there is an option called "Complement intervals of dataset". I am happy to help in case you are stuck at some point, but it feels like you did not research this on your own at all. A simple Google search would have landed you at https://wiki.galaxyproject.org/Learn/IntervalOperations

 

written 3.9 years ago by aditi.qamra