Question: Best way to systematically select ENCODE data to download?
Asked 16 months ago by Eric Lim (Stoke Therapeutics, Inc):

I normally use the following link to download ENCODE data.

https://www.encodeproject.org/files/{acc}/@@download/{acc}.fastq.gz
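For a single file, the URL pattern above can be filled in and streamed to disk roughly as follows. This is a minimal sketch, not an official ENCODE client: the helper names `fastq_url` and `download_fastq` are made up for illustration, and it uses only the standard library rather than any ENCODE tooling.

```python
import shutil
import urllib.request

# URL pattern quoted in the question; {acc} is a file accession.
FASTQ_URL = 'https://www.encodeproject.org/files/{acc}/@@download/{acc}.fastq.gz'

def fastq_url(acc):
    """Fill a file accession into the download URL pattern."""
    return FASTQ_URL.format(acc=acc)

def download_fastq(acc, dest=None):
    """Stream one FASTQ file to disk without holding it in memory."""
    dest = dest or acc + '.fastq.gz'
    with urllib.request.urlopen(fastq_url(acc)) as response, open(dest, 'wb') as out:
        shutil.copyfileobj(response, out)
    return dest
```

Calling `download_fastq(acc)` with a real file accession would write `{acc}.fastq.gz` to the current directory. The hard part remains finding the accessions in the first place.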

The online data selector is certainly one way to look up each {acc}, but I wonder whether I'm missing an easier way to batch-download the data I want from ENCODE.

The .tsv provided by ENCODE has all the information I need to select data (experiment, assay type, species, etc.), but I can't find anything in it that I can convert into accession IDs.

Any advice?

encode • 450 views
Answer, 16 months ago by Eric Lim (Stoke Therapeutics, Inc):

I am primarily interested in their knockdown (KD) and control RNA-Seq data, so I ended up writing a couple of simple functions that retrieve the file IDs for a given experiment accession. I hope this is helpful for someone.

from pprint import pprint

import requests

def get(resource,
        url='https://www.encodeproject.org/{}/?format=json',
        headers={'accept': 'application/json'}):
    # Accept both 'experiments/ENCSR...' and '/experiments/ENCSR.../'
    # so the formatted URL never contains doubled slashes.
    response = requests.get(url.format(resource.strip('/')), headers=headers)
    response.raise_for_status()
    return response.json()

def get_exp(exp_acc):
    def describe(f):
        # [file accession, read number, biological replicate number]
        return [f['accession'],
                f.get('paired_end'),  # None when the key is missing
                f['replicate']['biological_replicate_number']]

    # Build the path by hand; os.path.join would use '\' on Windows.
    response = get('experiments/' + exp_acc)
    controls = set()
    for f in response['files']:
        if f['file_type'] == 'fastq':
            yield ['KD'] + describe(f)
            # Collect the control experiments linked to this experiment.
            controls |= set(f['replicate']['experiment']['possible_controls'])
    for ctrl in controls:
        response = get(ctrl)
        for f in response['files']:
            if f['file_type'] == 'fastq':
                yield ['Control'] + describe(f)

pprint(list(get_exp('ENCSR426UUG')))
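To close the loop with the question, the rows produced above can be turned into download URLs using the URL pattern from the original post. The following is only a sketch: `download_plan` is a hypothetical helper, the output file naming scheme is my own choice, and the example rows use placeholder accessions rather than real ENCODE files.

```python
# URL pattern from the question; {acc} is a file accession.
URL = 'https://www.encodeproject.org/files/{acc}/@@download/{acc}.fastq.gz'

def download_plan(rows):
    """Map [condition, accession, paired_end, replicate] rows to (url, filename) pairs."""
    plan = []
    for condition, acc, paired_end, replicate in rows:
        # Label the read as R1/R2 when a read number is present, otherwise SE.
        read = 'R{}'.format(paired_end) if paired_end else 'SE'
        filename = '{}_rep{}_{}_{}.fastq.gz'.format(condition, replicate, read, acc)
        plan.append((URL.format(acc=acc), filename))
    return plan

# Made-up rows in the shape yielded by get_exp; the accessions are placeholders.
example_rows = [
    ['KD', 'ENCFF000AAA', '1', 1],
    ['KD', 'ENCFF000BBB', '2', 1],
    ['Control', 'ENCFF000CCC', None, 2],
]
for url, filename in download_plan(example_rows):
    print(filename, url)
```

Keeping the condition, replicate, and read number in the file name means the downloaded files stay identifiable without going back to the API.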