Question: Is there a way to get SRA info (paired library, sample ID, sequencing technology, etc.) using a list of SRA IDs?
24 days ago, S AR (Pakistan) wrote:

I have a list of 1000 SRA IDs and I want to get the information for all of them in table format. I mean the information shown on the SRA read page, such as:

https://www.ncbi.nlm.nih.gov/sra/?term=ERR038738

I need:

Read Id: ERR038738
Library:
Name: 2496237
Instrument: Illumina HiSeq 2000
Strategy: WGS
Source: GENOMIC
Layout: PAIRED
Experiment id: ERX015934
Sample accession: ERS023468
Study accession: ERP000520
Sample: 19744-sc-2011-02-15-1079093

Can anybody help?

awk sra • 118 views
24 days ago, Sej Modha (Glasgow, UK) wrote:

You can download SRA metadata in runinfo format, which gives comma-separated tabular output.

esearch -db sra -query ERR038738 | efetch -format runinfo

Parallelized:

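# Look up one accession and print its runinfo record as tab-separated text
function mymeta {
  esearch -db sra -query "$1" | efetch --format runinfo | tr ',' '\t'
}
# Export the function so the shells started by parallel can see it (bash only)
export -f mymeta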

cat accessions.txt | parallel -j 4 mymeta {}

From there, you can awk around as you like. By the way @OP, I already gave you this solution in your previous question on how to bulk download files; it was part of the command I suggested. Too bad you apparently did not invest the time to understand how the command worked, because you could have solved this question yourself.

Reply written 24 days ago by ATpoint
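
As one way to "awk around" the runinfo output, here is a minimal sketch that keeps only selected columns, looked up by header name rather than by position. The function name select_columns is made up for illustration, and the column names in the usage line (Run, LibraryName, Model, and so on) are assumed to match the runinfo header, so check them against your own output first. It splits naively on commas, so it would misbehave on quoted fields that contain commas.

```shell
#!/usr/bin/env bash
# select_columns is a hypothetical helper, not part of EDirect.
# $1    : comma-separated list of wanted column names, in output order
# stdin : CSV text whose first line is the header
select_columns() {
  awk -F',' -v want="$1" '
    NR == 1 {
      # Remember the wanted names and the position of every header field
      n = split(want, names, ",")
      for (i = 1; i <= NF; i++) pos[$i] = i
    }
    {
      # Build one tab-separated row; columns not in the header become "NA"
      line = ""
      for (j = 1; j <= n; j++) {
        line = line (j > 1 ? "\t" : "") ((names[j] in pos) ? $(pos[names[j]]) : "NA")
      }
      print line
    }'
}
```

Usage would look something like: esearch -db sra -query ERR038738 | efetch -format runinfo | select_columns 'Run,LibraryName,Model,LibraryStrategy,LibrarySource,LibraryLayout,Experiment,Sample,SRAStudy,SampleName'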

@ATpoint I tried it, but it did not give me any of this information. Your command worked on a file that I created in Linux, but when I gave it the full list of IDs it started giving errors. I split the list into three parts and was then able to download the data, but it still does not give me the information I am asking for here.

Reply written 23 days ago by S AR

I have now found that if I remove the column-cutting command, it gives me the info table. Thanks again, ATpoint, for the help. Can you please explain the above command? What is mymeta?

Reply written 23 days ago by S AR

From reading it, I understand that you have defined a function named mymeta, and I guess it can be run in Linux directly. Should I turn it into a Python script? Does Python have a built-in esearch/efetch module?

When I tried to run it as is in a bash script, it said:

/bin/bash: mymeta: command not found

Reply written 23 days ago by S AR
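
A "mymeta: command not found" error from /bin/bash usually means the shell that parallel started could not see the function, for example because the export -f line was left out or the script was run with sh instead of bash. This is only a guess at the cause here. The sketch below shows the mechanism with a stand-in mymeta that just echoes its argument (a placeholder, not the real esearch/efetch call), and uses bash -c to play the role of the child shell that parallel launches for each job.

```shell
#!/usr/bin/env bash
# Stand-in for the real lookup function (hypothetical body, no network needed)
mymeta() {
  printf 'looked up %s\n' "$1"
}

# Without this line, child bash processes do not inherit the function
export -f mymeta

# A child bash can now call the function by name
bash -c 'mymeta ERR038738'   # prints: looked up ERR038738
```

If the script is started as "sh script.sh" rather than "bash script.sh", export -f may not be available at all, since it is a bash feature.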

From your previous command, I tried the following:

cat ../MDR.txt | parallel -j 4 "esearch -db sra -query {} | efetch --format runinfo"

It gives me the results, but it repeats the header for each entry. Is there a way to get the header only once at the start, followed by one row of values per SRA ID?

Reply written 23 days ago by S AR
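
One possible way to keep a single header is to filter the combined output so that only the first header line survives. A minimal sketch, assuming every runinfo header line starts with "Run," (as it does in the output I have seen) and that blank lines between records can be dropped; dedup_header is a made-up name.

```shell
#!/usr/bin/env bash
# dedup_header is a hypothetical helper: it reads concatenated runinfo CSV
# on stdin, keeps the first header line, and drops repeats and blank lines.
dedup_header() {
  awk '
    NF == 0 { next }                # skip empty lines between records
    /^Run,/ { if (seen++) next }    # header line: let only the first one through
    { print }
  '
}
```

It would slot onto the end of the existing pipeline, for example: cat ../MDR.txt | parallel -j 4 "esearch -db sra -query {} | efetch --format runinfo" | dedup_header > runinfo.csv (the output file name is arbitrary).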