Question

How to get information from the UHGG website with Python

0

22 months ago

xzhang55 • 0

https://www.ebi.ac.uk/metagenomics/genomes/MGYG000000001#overview

Hi Guys,

I want to get the "Genome statistics" section from this page, but I don't know how to do it with Python. I would appreciate it if anyone could help. I tried the code below, but the output is empty.

import requests
from bs4 import BeautifulSoup 
html = requests.get("https://www.ebi.ac.uk/metagenomics/genomes/MGYG000000001#overview")
soup = BeautifulSoup(html.text,"lxml")
# print(soup.prettify())
# print(soup.title)
# print(soup.head)
soup1 = soup.select("div.vf-grid.vf-grid__col-2")
print(soup1)

python UHGG • 651 views

updated 21 months ago by Ram 43k • written 22 months ago by xzhang55 • 0

score 1 · Answer 1 · 2022-06-19

1

22 months ago

Pierre Lindenbaum 161k

Why scrape the page when there is a REST API?

$ wget -q -O - "https://www.ebi.ac.uk/metagenomics/api/v1/genomes?accession=MGYG000000001" | python -m json.tool
{
    "data": [
        {
            "attributes": {
                "accession": "MGYG000000001",
                "completeness": 98.59,
                "contamination": 0.7,
                "eggnog-coverage": 93.78,
                "ena-genome-accession": null,
                "ena-sample-accession": "ERS370061",
                "ena-study-accession": "ERP105624",
                "first-created": "2021-12-07T18:06:34.762463",
                "gc-content": 28.26,
                "genome-id": 4888,
                "geographic-origin": "Europe",
                "geographic-range": [
(...)
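
The same request can be made from Python. This is only a minimal sketch: it uses the standard-library urllib in place of wget, the URL is the one shown above, and it assumes the response keeps the layout printed above (a "data" list whose entries each carry an "attributes" object). The function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the wget command above.
API_ROOT = "https://www.ebi.ac.uk/metagenomics/api/v1/genomes"


def build_url(accession):
    """Return the query URL for one genome accession."""
    return API_ROOT + "?" + urllib.parse.urlencode({"accession": accession})


def fetch_genomes(accession, timeout=30):
    """Download and parse the JSON document (needs network access)."""
    with urllib.request.urlopen(build_url(accession), timeout=timeout) as response:
        return json.load(response)


def genome_attributes(payload):
    """Return the "attributes" dict of every entry under "data"."""
    # Assumption: the payload looks like the output shown above.
    return [entry.get("attributes", {}) for entry in payload.get("data", [])]
```

With network access, genome_attributes(fetch_genomes("MGYG000000001")) should give a list of dictionaries holding fields such as "completeness", "contamination" and "gc-content".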

0

Hi, I'm not very familiar with the API. Could you please give me more details? I would like to download the "data" part as a CSV file.
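
One possible way to save the "data" part as CSV, as a rough sketch rather than a definitive recipe. It assumes the JSON layout shown in the answer ("data" is a list and each entry has an "attributes" object) and writes one row per entry with one column per attribute. Nested values such as "geographic-range" are stored as JSON text in a single cell, which is a choice made here and not something the API prescribes. The function names are hypothetical.

```python
import csv
import json
import urllib.request

# URL taken from the answer above; only the accession is a parameter.
URL = "https://www.ebi.ac.uk/metagenomics/api/v1/genomes?accession={accession}"


def to_cell(value):
    """Keep scalars as they are; JSON-encode lists and dicts so they fit one cell."""
    if isinstance(value, (list, dict)):
        return json.dumps(value)
    return value


def write_attributes_csv(payload, handle):
    """Write one CSV row per entry in payload["data"], one column per attribute."""
    rows = [entry.get("attributes", {}) for entry in payload.get("data", [])]
    # Union of keys in first-seen order, so entries with different fields still fit.
    columns = list(dict.fromkeys(key for row in rows for key in row))
    writer = csv.DictWriter(handle, fieldnames=columns)
    writer.writeheader()
    for row in rows:
        writer.writerow({key: to_cell(value) for key, value in row.items()})


def download_csv(accession, path):
    """Fetch one genome from the API (needs network access) and save it as CSV."""
    with urllib.request.urlopen(URL.format(accession=accession), timeout=30) as response:
        payload = json.load(response)
    with open(path, "w", newline="", encoding="utf-8") as handle:
        write_attributes_csv(payload, handle)
```

Calling download_csv("MGYG000000001", "genome.csv") should then produce a file with a header line followed by one row for that genome. Null values such as "ena-genome-accession" end up as empty cells.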