Let's say I would like to download from NCBI all genomes obtained for marine bacteria (or soil- or gut-associated ones). I figured that E-utilities could work for me.
To get information on the environmental source, I need to check the BioSample records, so I would do something like:
esearch -db biosample -query "marine" | efetch -format tabular

1: Photobacterium sanguinicancer CAIM 1827T
Identifiers: BioSample: SAMN04252530; Sample name: CAIM1827T.1; SRA: SRS1159004
Organism: Photobacterium sanguinicancri
Attributes:
    /strain="CAIM 1827"
    /host="Maja brachydactyla"
    /isolation source="Hemolymph"
    /collection date="06-Dec-2005"
    /geographic location="Spain: Ria a Coruna"
    /sample type="Bacterium"
    /altitude="0 m"
    /biomaterial provider="Collection of Aquatic Important Microorganisms"
    /culture collection="not applicable"
    /environment biome="marine"
    /host tissue sampled="hemolymph"
    /identified by="Bruno Gomez-Gil"
    /latitude and longitude="43.21 N 8.2200 W"
    /specimen voucher="not applicable"
Description: Draft genome of Photobacterium sanguinicancer type strain CAIM 1827T
Accession: SAMN04252530 ID: 4252530
.....
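To make the "check the BioSample" step concrete, here is a minimal sketch of how the environment field and the accessions could be pulled out of one record like the one printed above. It only scrapes the text output with sed, which is an assumption on my side and probably not the most robust route; the record is abridged from the output above, and get_attribute is a hypothetical helper name.

```shell
#!/bin/sh
# One record as printed by efetch above, abridged to the fields used here.
record='1: Photobacterium sanguinicancer CAIM 1827T
Identifiers: BioSample: SAMN04252530; Sample name: CAIM1827T.1; SRA: SRS1159004
Attributes:
    /isolation source="Hemolymph"
    /environment biome="marine"
Accession: SAMN04252530 ID: 4252530'

# Hypothetical helper: print the quoted value of one attribute.
# $1 is the attribute name exactly as printed, e.g. "environment biome".
get_attribute() {
    printf '%s\n' "$record" | sed -n "s|.*/$1=\"\([^\"]*\)\".*|\1|p"
}

biome=$(get_attribute "environment biome")

# The BioSample and SRA accessions sit on the "Identifiers:" line.
accession=$(printf '%s\n' "$record" | sed -n 's/.*BioSample: \([A-Z0-9]*\);.*/\1/p')
sra=$(printf '%s\n' "$record" | sed -n 's/.*SRA: \([A-Z0-9]*\).*/\1/p')

# Tab-separated: BioSample accession, SRA accession, biome.
printf '%s\t%s\t%s\n' "$accession" "$sra" "$biome"
```

Filtering on the biome value this way would at least let me keep only the accessions whose record really says "marine", rather than every record where the word appears somewhere in the text.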
Next, I would like to either download the corresponding assemblies/SRA records or at least access them, and this is where I get quite confused.
As far as I can tell, I could use efetch to retrieve the sequences. However, there seems to be no direct link between querying BioSample records and accessing the underlying data via E-utilities.
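The closest thing I have come across is elink, which follows precomputed links between Entrez databases. Below is a rough sketch of what I imagine the pipeline would look like, assuming EDirect is installed, that the BioSample hits actually have links to the assembly database, and that the assembly summaries carry the AssemblyAccession, BioSampleAccn and FtpPath_GenBank fields. The function name biosample_to_assembly is hypothetical, and I have not verified this end to end.

```shell
#!/bin/sh
# Sketch: follow Entrez links from BioSample hits to Assembly records.
# Assumes EDirect (esearch, elink, esummary, xtract) is on PATH.

# Hypothetical helper. $1 is an Entrez query against the biosample database.
biosample_to_assembly() {
    esearch -db biosample -query "$1" |
        elink -target assembly |
        esummary |
        xtract -pattern DocumentSummary \
               -element AssemblyAccession BioSampleAccn FtpPath_GenBank
}

# Example call (needs network access, so it is left commented out):
# biosample_to_assembly "marine" > marine_assemblies.tsv
```

If this works as I hope, each output line would give an assembly accession, the BioSample it came from, and an FTP directory to download from. For reads instead of assemblies, I assume the same idea applies with elink -target sra followed by efetch -format runinfo.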
Could someone out there enlighten me?