Question: ENCODE ChIP-Seq Control Experiments
2.2 years ago, ghazalhaddad wrote:

Hello, I have a question regarding the ChIP-seq data from the ENCODE Project available at goo.gl/uTlPgE. I have downloaded all of the data available from Homo sapiens transcription factor ChIP-seq experiments (more than 2500 fastq files), and I want to do peak calling on these files after alignment. For that, I need a control file for each ChIP-seq experiment file. However, I don't know how to find the ID of the control experiment corresponding to each experiment, since no file on the ENCODE website lays out the control experiments for every experiment. Does anyone know how to find the control experiments for this very large number of downloaded files without looking each experiment up separately on the ENCODE website?

I would appreciate any responses!

modified 2.2 years ago by Ryan Dale • written 2.2 years ago by ghazalhaddad

You'd have to go through them manually; each experiment has a control dataset specified. Well... almost all of them do. ENCODE is funny sometimes, and this is one of its most frustrating quirks.

written 2.2 years ago by Sinji

That's the only way I have found to work as well. The problem is that it will take more time than I'm willing to spend to go through all the files I have downloaded and find their control experiments manually. I really hope there is another way!

written 2.2 years ago by ghazalhaddad
2.2 years ago, Ryan Dale (Bethesda, MD) wrote:

I recently had to deal with this as well.

I typically identify a data subset of interest interactively on encodeproject.org and then click the "Download" button. The first line of the downloaded file is a URL for the general metadata, e.g.,

https://www.encodeproject.org/metadata/type=Experiment&biosample_term_name=HepG2&assay_title=ChIP-seq&limit=all/metadata.tsv

But this file doesn't contain info on controls. The trick is to use their REST API to query for individual accessions. Here's a working example that assigns controls to that metadata.tsv which you should be able to adapt for your use-case:
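A minimal sketch of that approach (not the original code), using only the Python standard library. It assumes that metadata.tsv has an "Experiment accession" column, that an experiment record can be fetched as JSON from `/experiments/<accession>/?format=json`, and that the record lists its controls under `possible_controls`. The function names are hypothetical, and the key name should be checked against a real record before relying on it.

```python
import csv
import json
import urllib.request

ENCODE_BASE = "https://www.encodeproject.org"


def fetch_experiment(accession):
    """Fetch the JSON record for one experiment accession from the ENCODE REST API."""
    url = f"{ENCODE_BASE}/experiments/{accession}/?format=json"
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def control_accessions(experiment):
    """Return the control experiment accessions listed in an experiment record.

    Assumption: controls sit under 'possible_controls', either as embedded
    objects with an 'accession' key or as path strings of the form
    '/experiments/<accession>/'.
    """
    controls = []
    for item in experiment.get("possible_controls", []):
        if isinstance(item, dict):
            accession = item.get("accession")
        else:
            accession = item.strip("/").split("/")[-1]
        if accession:
            controls.append(accession)
    return controls


def experiment_accessions(metadata_lines):
    """Collect the unique experiment accessions from the lines of metadata.tsv."""
    reader = csv.DictReader(metadata_lines, delimiter="\t")
    return sorted({row["Experiment accession"] for row in reader})


def assign_controls(metadata_lines, fetch=fetch_experiment):
    """Map each experiment accession in metadata.tsv to its control accessions."""
    return {
        accession: control_accessions(fetch(accession))
        for accession in experiment_accessions(metadata_lines)
    }
```

Calling `assign_controls(open("metadata.tsv", newline=""))` issues one request per unique experiment rather than one per file, which keeps the number of requests manageable for a few thousand fastq files. Experiments with no control listed come back with an empty list, so they are easy to spot and check by hand.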

written 2.2 years ago by Ryan Dale

This works perfectly, thank you so much!

written 2.2 years ago by ghazalhaddad

Now 'possible_controls' has been changed to 'Controlled by'.

written 11 weeks ago by Vanilla80
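If 'Controlled by' refers to a column in metadata.tsv itself, the controls could be read straight from that file without any API calls. The sketch below rests on assumptions that should be checked against a real download: that the columns are named "File accession" and "Controlled by", and that the latter holds comma-separated paths of the form `/files/<accession>/`. The function name is hypothetical.

```python
import csv


def controls_by_file(metadata_lines):
    """Map each file accession to the control file accessions in its row.

    Assumption: the 'Controlled by' column holds comma-separated paths such
    as '/files/<accession>/'; an empty cell means no control is listed.
    """
    reader = csv.DictReader(metadata_lines, delimiter="\t")
    mapping = {}
    for row in reader:
        raw = row.get("Controlled by") or ""
        controls = [
            part.strip().strip("/").split("/")[-1]
            for part in raw.split(",")
            if part.strip()
        ]
        mapping[row["File accession"]] = controls
    return mapping
```

With a downloaded file this would be called as `controls_by_file(open("metadata.tsv", newline=""))`. Note that this gives control files rather than control experiments, so the control files themselves may still need to be downloaded separately.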