Question: How to Find Sample Information from SRR Labels in GEO?
3
4.8 years ago by
user770
United States
user770 wrote:

From GEO one can download SRR* files (ending in .sra) of Illumina data, which can be extracted to FASTQ with fastq-dump. How can the sample information for these SRR* IDs be read programmatically from the GEO/SRA metadata? The project SRP005601 has a sample "SRR097786" which is not described anywhere in the SOFT/MINiML files, and those files are incredibly complicated. How can I find the information describing the sample label from GEO?

The only manual solution I found is through http://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=search_obj ("Search SRA objects") in NCBI Trace. I type in each SRR* ID by hand and then click around until I find the sample information, one sample at a time. This is a terrible manual solution, so I was hoping to download the metadata and parse it from a CSV file.

modified 2.1 years ago by t.kuilman • written 4.8 years ago by user770
4
4.8 years ago by
Devon Ryan (79k)
Freiburg, Germany
Devon Ryan wrote:

SRR097786 is the run, not the sample (SRA is confusing). Assuming you're using R anyway:

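library(SRAdb)  # Bioconductor package
# Download the SRA metadata SQLite file (large) and open a connection to it
sqlfile <- getSRAdbFile()
sra_con <- dbConnect(SQLite(), sqlfile)
# Look up the run; the sample fields appear as columns of the result
res <- dbGetQuery(sra_con, "select * from sra_ft where run_accession='SRR097786'")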

Almost anything you want to know is in the various columns, and you can tailor the query to select only the ones you need. You can also download the database yourself and query it directly, if you prefer. If you only need to do this once, then you just have to click around on SRA to find what you want. It's an unfortunately complicated site.
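
For the direct-database route, something like the following might work from a shell. This is only a sketch under several assumptions: the `sqlite3` command-line tool is installed, the file fetched by getSRAdbFile() has been saved locally (called SRAmetadb.sqlite here), and `run_metadata` is a made-up helper name. Only the `sra_ft` table and the `run_accession` column come from the query above; any other column names you pass in are yours to check against the database schema.

```shell
# Hypothetical helper: print the metadata row(s) for one run accession as CSV.
# Usage: run_metadata DB_FILE RUN_ACCESSION [COLUMNS]
run_metadata() {
    db=$1
    run=$2
    cols=${3:-*}   # default to every column, as in the R query above
    # The accession is pasted into the SQL text, so accept only letters and digits.
    case $run in
        ''|*[!A-Za-z0-9]*) echo "bad run accession: $run" >&2; return 1 ;;
    esac
    sqlite3 -header -csv "$db" \
        "select $cols from sra_ft where run_accession='$run'"
}
```

Called as `run_metadata SRAmetadb.sqlite SRR097786`, this should print a header line followed by the matching row. Passing a third argument such as `'run_accession, sample_accession'` (an assumed column name) narrows the output.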

modified 4.8 years ago • written 4.8 years ago by Devon Ryan
0
2.1 years ago by
t.kuilman (410)
Netherlands
t.kuilman wrote:

I agree Devon's solution is the handiest if you use R. However, it is missing some information (for instance, the number of spots for each sample) that you can get as follows:

wget -O SRP005601_metadata.csv 'http://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?save=efetch&db=sra&rettype=runinfo&term=SRP005601'
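
Once the CSV is on disk, the run-to-sample mapping can be pulled out with awk. The sketch below assumes the runinfo file has a header row with column names such as Run, SampleName, Sample and spots (check the first line of your download), and that no field contains a quoted comma, since it splits on plain commas. `runinfo_columns` is a hypothetical helper name.

```shell
# Hypothetical helper: print chosen columns of a runinfo CSV, looked up by header name.
# Usage: runinfo_columns CSV_FILE COLUMN[,COLUMN...]
runinfo_columns() {
    awk -F',' -v want="$2" '
        BEGIN { n = split(want, names, ","); OFS = "\t" }
        { sub(/\r$/, "") }        # tolerate CRLF line endings
        NF == 0 { next }          # skip blank lines
        !seen_header {
            # First non-blank row is the header: record where each column sits.
            for (i = 1; i <= NF; i++) pos[$i] = i
            for (j = 1; j <= n; j++)
                if (!(names[j] in pos)) {
                    print "no such column: " names[j] > "/dev/stderr"
                    exit 1
                }
            seen_header = 1
        }
        {
            # Emit the requested fields, tab-separated, in the order asked for.
            line = $(pos[names[1]])
            for (j = 2; j <= n; j++) line = line OFS $(pos[names[j]])
            print line
        }
    ' "$1"
}
```

For example, `runinfo_columns SRP005601_metadata.csv Run,SampleName,Sample,spots` should list one line per run with its sample label and spot count, provided those column names match the header. Looking columns up by name rather than position keeps the helper working if the column order changes.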
modified 2.1 years ago • written 2.1 years ago by t.kuilman