Get corresponding BioSample accessions for a very large list of SRA accessions?
3.4 years ago

I have a list of ~1000 SRR run accessions and I want to get the BioSample accession for each of them without looking them up manually. Is there an easy script for this sort of thing?

Thanks!

next-gen • 1.0k views
3.4 years ago
vkkodali_ncbi ★ 3.7k

You can use Entrez Direct for this as follows:

$ esearch -db sra -query 'SRR5437876' | elink -target biosample | efetch 
1: Human sample from Homo sapiens
Identifiers: BioSample: SAMN06710536; Sample name: MCF-7; SRA: SRS2116118
Organism: Homo sapiens
Attributes:
    /isolate="MCF-7"
    /age="69 years"
    /biomaterial provider="missing"
    /sex="female"
    /tissue="breast"
    /cell line="MCF-7 cancer cell line"
Accession: SAMN06710536 ID: 6710536

If you add -format native -mode xml to the final efetch command, the output is returned as XML, which can be parsed with xtract, another Entrez Direct tool.
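Since the question is about a list of ~1000 accessions, here is a rough sketch of how the single-accession pipeline above could be wrapped in a loop. It assumes the default text report always contains an "Identifiers: BioSample: ..." line like the one shown above. The function names and the input file name (srr_list.txt, one accession per line) are made up for illustration and are not part of Entrez Direct.

```shell
#!/usr/bin/env bash
# Sketch only: map each SRR run accession to its BioSample accession.
# Function names below are hypothetical helpers, not Entrez Direct commands.

# Pull the BioSample accession out of efetch's default text report, which
# contains a line such as "Identifiers: BioSample: SAMN06710536; ...".
extract_biosample() {
    sed -n 's/^Identifiers: BioSample: \([A-Z0-9]*\).*/\1/p'
}

# Look up one run accession (needs Entrez Direct installed and network access).
srr_to_biosample() {
    esearch -db sra -query "$1" | elink -target biosample | efetch | extract_biosample
}

# Read accessions from a file, one per line, and print tab-separated
# SRR / BioSample pairs. The lookup's stdin is redirected from /dev/null
# so that it cannot consume the remaining lines of the list.
map_list() {
    while read -r srr; do
        [ -z "$srr" ] && continue
        printf '%s\t%s\n' "$srr" "$(srr_to_biosample "$srr" < /dev/null)"
    done < "$1"
}
```

After sourcing this, something like `map_list srr_list.txt > srr_to_biosample.tsv` should give a two-column table. With ~1000 lookups it is probably worth setting an NCBI API key and checking the result for empty second columns, since any run that fails to resolve would simply print a blank there.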
