Question: Best way to systematically select ENCODE data to download?
Asked 9 months ago by Eric Lim (Boston):

I normally download ENCODE FASTQ files with the following URL pattern, where {acc} is the file accession:

https://www.encodeproject.org/files/{acc}/@@download/{acc}.fastq.gz
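For readers who already have a list of file accessions, here is a minimal sketch of expanding the URL pattern above into a plain-text list of download links. The helper names `fastq_url` and `write_url_list` and the accession `ENCFF000AAA` are made up for illustration; only the URL pattern comes from the post.

```python
URL_TEMPLATE = 'https://www.encodeproject.org/files/{acc}/@@download/{acc}.fastq.gz'

def fastq_url(acc):
    """Fill the URL pattern from the post with one file accession."""
    return URL_TEMPLATE.format(acc=acc)

def write_url_list(accessions, path):
    """Write one download URL per line and return how many were written."""
    urls = [fastq_url(acc) for acc in accessions]
    with open(path, 'w') as handle:
        handle.write('\n'.join(urls) + '\n')
    return len(urls)

# 'ENCFF000AAA' is a placeholder accession, not a real ENCODE file.
print(fastq_url('ENCFF000AAA'))
```

A file written this way can then be handed to a batch downloader, for example `wget -i urls.txt`.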

Using the online data selector is one way to find the {acc} values, but I'm wondering whether I'm missing an easier way to batch download the data I want from ENCODE.

The .tsv provided by ENCODE has all the information I need to select the data I want (experiment, assay type, species, etc.), but I can't find anything in it that I can convert into accession IDs.

Any advice?

Answer by Eric Lim (9 months ago):
I am primarily interested in their knockdown (KD) and control RNA-Seq data, so I ended up writing a couple of simple functions that retrieve the FASTQ file accessions for a given experiment accession, along with those of its possible controls. I hope this is helpful for someone.

from pprint import pprint

import requests

BASE_URL = 'https://www.encodeproject.org/{}/?format=json'
HEADERS = {'accept': 'application/json'}

def get(resource, url=BASE_URL, headers=HEADERS):
    # Strip slashes so a path such as '/experiments/<acc>/' does not
    # produce '//' in the final URL.
    response = requests.get(url.format(resource.strip('/')), headers=headers)
    response.raise_for_status()
    return response.json()

def get_exp(exp_acc):
    def describe(f):
        # 'paired_end' may be missing (e.g. single-end runs), hence .get().
        return [f['accession'],
                f.get('paired_end'),
                f['replicate']['biological_replicate_number']]

    # FASTQ files of the knockdown experiment itself.
    response = get('experiments/' + exp_acc)
    controls = set()
    for f in response['files']:
        if f['file_type'] == 'fastq':
            yield ['KD'] + describe(f)
            controls |= set(f['replicate']['experiment']['possible_controls'])
    # FASTQ files of every control experiment referenced above.
    for ctrl in controls:
        response = get(ctrl)
        for f in response['files']:
            if f['file_type'] == 'fastq':
                yield ['Control'] + describe(f)

pprint(list(get_exp('ENCSR426UUG')))
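Each row produced above has the shape [label, file accession, paired_end, biological replicate]. As a possible follow-up, the sketch below groups such rows by label and replicate so that the mates of a paired-end run end up next to each other. `group_rows` is a hypothetical helper, the sample accessions are placeholders rather than real ENCODE files, and it assumes paired_end is '1', '2', or None.

```python
from collections import defaultdict

def group_rows(rows):
    """Group [label, accession, paired_end, replicate] rows by (label, replicate)."""
    groups = defaultdict(list)
    for label, accession, paired_end, replicate in rows:
        groups[(label, replicate)].append((paired_end, accession))
    # Sort within each group so read 1 comes before read 2; files without
    # a paired_end value (None) sort last.
    return {key: [acc for _, acc in sorted(mates, key=lambda m: str(m[0]))]
            for key, mates in groups.items()}

# Placeholder rows in the same shape that get_exp() yields.
sample_rows = [
    ['KD', 'ENCFF000AAB', '2', 1],
    ['KD', 'ENCFF000AAA', '1', 1],
    ['Control', 'ENCFF000AAC', '1', 1],
]
for key, accessions in group_rows(sample_rows).items():
    print(key, accessions)
```

The grouped accessions can then be substituted into the download URL pattern from the question, one replicate at a time.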