Question: Get a complete GSM-to-SRX/SRR table
predeus (Russia) wrote, 18 months ago:

Hello all,

When working with publicly available data, one often has to find a dataset in GEO first and then download the raw reads from the SRA. I've been using the Entrez Direct tools from NCBI for this.

For example, I can run the following query:

esearch -db sra -query GSM1467783 | efetch -format runinfo

And get the following output:

Run,ReleaseDate,LoadDate,spots,bases,spots_with_mates,avgLength,size_MB,AssemblyName,download_path,Experiment,LibraryName,LibraryStrategy,LibrarySelection,LibrarySource,LibraryLayout,InsertSize,InsertDev,Platform,Model,SRAStudy,BioProject,Study_Pubmed_id,ProjectID,Sample,BioSample,SampleType,TaxID,ScientificName,SampleName,g1k_pop_code,source,g1k_analysis_group,Subject_ID,Sex,Disease,Tumor,Affection_Status,Analyte_Type,Histological_Type,Body_Site,CenterName,Submission,dbgap_study_accession,Consent,RunHash,ReadHash 
SRR1539207,2015-07-22,2015-12-01,37427735,1871386750,0,50,1424,,https://sra-download.ncbi.nlm.nih.gov/srapub/SRR1539207,SRX672143,,RNA-Seq,cDNA,TRANSCRIPTOMIC,SINGLE,0,0,ILLUMINA,Illumina HiSeq 2000,SRP045352,PRJNA257777,2,257777,SRS676636,SAMN02978909,simple,9606,Homo sapiens,GSM1467783,,,,,,,no,,,,,GEO,SRA178460,,public,B5573983CBB9C5E2046EB16D1C15A72F,25EC9BD6214D8EC0E7B270D41E6E861F
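
For the mapping itself, only three of these columns matter: Run, Experiment and SampleName. Below is a minimal sketch that picks them out of the CSV by header name. The function name `runinfo_to_map` is made up for illustration, and the sketch assumes that no field is quoted with embedded commas (true for the record above, not guaranteed in general) and that SampleName holds the GSM ID, as it does here.

```shell
# runinfo_to_map: read runinfo CSV on stdin and print
# "SampleName<TAB>Experiment<TAB>Run" for each data row.
# Hypothetical helper; assumes no quoted fields with embedded commas.
runinfo_to_map() {
    awk -F',' '
        # Any line whose first field is "Run" is treated as a header,
        # so a header repeated later in the stream is simply re-read.
        $1 == "Run" {
            for (i = 1; i <= NF; i++) {
                name = $i
                sub(/[ \t\r]+$/, "", name)   # header may carry trailing whitespace
                col[name] = i
            }
            next
        }
        # Skip blank lines and anything seen before a header.
        NF > 1 && ("SampleName" in col) {
            print $(col["SampleName"]) "\t" $(col["Experiment"]) "\t" $(col["Run"])
        }
    '
}
```

It can be appended to the query above, e.g. `esearch -db sra -query GSM1467783 | efetch -format runinfo | runinfo_to_map`.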

Now I'm looking for a way to do this offline: to generate one big reference table of GSM-to-.sra file correspondence, so that by specifying one or several GSM IDs I can download all the relevant .sra files from the FTP server.

I've found some R package wrappers around SQL databases; however, I'm puzzled that no table in SRAdb (I believe I've checked them all) includes GSM IDs. I also haven't found the SRX/SRR IDs in the GEOmetadb tables, although I might have missed something there.

At any rate, is there any database that establishes the correspondence between GSM and SRX/SRR identifiers?

If anybody can help me out with this, I'd be very grateful.

Tags: sradb, gse, gsm, sra, geo
Gregor Sturm (Munich) wrote, 18 months ago:

Have a look at the SRA FTP server. It provides a file called SRA_Accessions.tab, which links the various identifiers to each other.

With

grep '^SRR' SRA_Accessions.tab | grep GSM

you can extract the SRR to GSM mapping.
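
The grep pipeline returns whole lines. To get a compact GSM, SRX, SRR table instead, something like the following sketch may help. It rests on assumptions worth checking against your copy of the file: that it is tab-separated, that its header row names the columns "Accession", "Alias" and "Experiment", and that run rows for GEO submissions carry the GSM ID in the Alias column. The function name `accessions_to_map` is hypothetical.

```shell
# accessions_to_map FILE: print "GSM<TAB>SRX<TAB>SRR" for every SRR row
# whose Alias field contains a GSM ID.
# Hypothetical helper; the column names are assumptions about the file layout.
accessions_to_map() {
    awk -F'\t' '
        NR == 1 {
            # Locate columns by header name rather than by fixed position.
            for (i = 1; i <= NF; i++) col[$i] = i
            if (!("Accession" in col) || !("Alias" in col) || !("Experiment" in col)) {
                print "unexpected header in accessions file" > "/dev/stderr"
                exit 1
            }
            next
        }
        $(col["Accession"]) ~ /^SRR/ && match($(col["Alias"]), /GSM[0-9]+/) {
            gsm = substr($(col["Alias"]), RSTART, RLENGTH)
            print gsm "\t" $(col["Experiment"]) "\t" $(col["Accession"])
        }
    ' "$1"
}
```

Unlike the plain grep, this only looks for "GSM" in the Alias column, so it can miss rows where the ID appears elsewhere on the line. Once the table is saved, e.g. `accessions_to_map SRA_Accessions.tab > gsm_srx_srr.tsv`, one or several GSM IDs can be looked up with `grep -w -F GSM1467783 gsm_srx_srr.tsv`.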


This is exactly what I've been looking for. Thank you very much!

(reply from predeus, 18 months ago)
predeus (Russia) wrote, 7 months ago:

Note that the table seems to be regularly updated - the most up-to-date file is now twice the size it was 11 months ago.
