Get fastq/sra from ArrayExpress and/or GEO programmatically for specific organism/experiment type
4.9 years ago
rioualen ▴ 590


I would like to get all the sequencing data for a specific organism and/or experiment type from ArrayExpress. I looked into REST queries here, and built the following request: "Escherichia+coli+K-12"AND"ChIP-seq"

If I get the accession number from each experiment, I can get a table summarizing the samples: <accession>/<accession>.sdrf.txt

However, the column names in these tables are not consistent from one experiment to the next. I need either the SRR and SRX identifiers or the ERR identifier in order to reach the SRA files or fastq files.
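To illustrate what I mean, here is a minimal sketch of one way around the inconsistent column names: ignore the headers and scan every cell of the SDRF table for anything shaped like a run (SRR/ERR/DRR) or experiment (SRX/ERX/DRX) accession. The function name and the sample table are made up for illustration, and I am assuming accessions always have at least five digits.

```python
import csv
import io
import re

# Run (xRR) and experiment (xRX) accessions issued by SRA (S), ENA (E) or DDBJ (D).
# Lookarounds are used instead of \b so that "ERR000001_1.fastq.gz" still matches.
ACCESSION_RE = re.compile(r"(?<![A-Za-z0-9])([SED]R[RX]\d{5,})(?!\d)")


def extract_accessions(sdrf_text):
    """Scan every cell of a tab-delimited SDRF table, whatever the column names are."""
    found = {"run": set(), "experiment": set()}
    reader = csv.reader(io.StringIO(sdrf_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        for cell in row:
            for acc in ACCESSION_RE.findall(cell):
                # Third character distinguishes runs (R) from experiments (X).
                kind = "run" if acc[2] == "R" else "experiment"
                found[kind].add(acc)
    return {kind: sorted(accs) for kind, accs in found.items()}


# Made-up table; real SDRF headers differ between experiments.
sample_sdrf = (
    "Source Name\tSome Run Column\tSome URI Column\n"
    "S1\tERR000001\tftp://example.org/ERR000001_1.fastq.gz\n"
    "S2\tSRR000002\tSRX000003\n"
)
accessions = extract_accessions(sample_sdrf)
```

This only recovers identifiers that are literally present in the table; it does not resolve an SRX to its SRR runs.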

I would also like to do it from GEO, but then I need the GSE & GSM identifiers from the experiments, and I can't find them reliably either. This page seems useful but it doesn't say how to construct a query from scratch.
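For the GEO side, the closest thing I can sketch is a query against the NCBI E-utilities `esearch` endpoint with `db=gds`, restricted to series (GSE) entries. This is only an assumption about how such a query could be built from scratch: the function name is made up, the keyword is passed as plain free text, and I have not checked that this returns every relevant experiment.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_search_url(organism, keyword, retmax=100):
    """Build an esearch URL for GEO series matching an organism and a free-text keyword."""
    # Assumed search term: organism field, free-text keyword, series entries only.
    term = f'"{organism}"[Organism] AND {keyword} AND gse[Entry Type]'
    params = {"db": "gds", "term": term, "retmax": retmax, "retmode": "json"}
    return EUTILS_ESEARCH + "?" + urlencode(params)


url = build_geo_search_url("Escherichia coli", "ChIP-seq")
```

As far as I understand, the search returns numeric UIDs rather than GSE accessions, so a second lookup (for example with `esummary`) would presumably still be needed to get the GSE and GSM identifiers.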

Overall, I'm completely lost among all the different types of identifiers and how they connect to each other...

arrayexpress fastq sra geo • 2.2k views
