How to get information from the UHGG website with Python
22 months ago
xzhang55 • 0

https://www.ebi.ac.uk/metagenomics/genomes/MGYG000000001#overview

Hi Guys,

I want to get the Genome statistics section of this webpage, but I don't know how to do it with Python. I would appreciate it if anyone could help me. I tried the code below, but the output is empty.


import requests
from bs4 import BeautifulSoup

# Fetch the genome page and parse the returned HTML
response = requests.get("https://www.ebi.ac.uk/metagenomics/genomes/MGYG000000001#overview")
soup = BeautifulSoup(response.text, "lxml")
# print(soup.prettify())
# print(soup.title)
# print(soup.head)

# Select the two-column grid elements, where the Genome statistics were expected
stats = soup.select("div.vf-grid.vf-grid__col-2")
print(stats)
python UHGG • 651 views
22 months ago

Why scrape the page when there is a REST API?

$ wget -q -O - "https://www.ebi.ac.uk/metagenomics/api/v1/genomes?accession=MGYG000000001" | python -m json.tool
{
    "data": [
        {
            "attributes": {
                "accession": "MGYG000000001",
                "completeness": 98.59,
                "contamination": 0.7,
                "eggnog-coverage": 93.78,
                "ena-genome-accession": null,
                "ena-sample-accession": "ERS370061",
                "ena-study-accession": "ERP105624",
                "first-created": "2021-12-07T18:06:34.762463",
                "gc-content": 28.26,
                "genome-id": 4888,
                "geographic-origin": "Europe",
                "geographic-range": [
(...)

Hi, I'm not very familiar with the API. Could you please give me more details? I would like to download the "data" part as a CSV file.
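
One way to go from the API response to a CSV file, as a rough sketch rather than an official recipe. The function names (fetch_genomes, flatten_records, write_csv) are made up for this example, and it assumes that every entry in "data" keeps its fields under "attributes", as in the output shown above. It uses only the standard library (urllib in place of requests), and nested values are stored as JSON text so that each one fits in a single cell.

```python
import csv
import json
from urllib.request import urlopen

# Same endpoint as the wget command in the answer above.
API_URL = "https://www.ebi.ac.uk/metagenomics/api/v1/genomes?accession={accession}"


def fetch_genomes(accession):
    """Download and decode the JSON document returned for one genome accession."""
    with urlopen(API_URL.format(accession=accession)) as response:
        return json.load(response)


def flatten_records(payload):
    """Build one flat dict per entry of payload["data"], taken from its "attributes"."""
    rows = []
    for record in payload.get("data", []):
        row = {}
        for key, value in record.get("attributes", {}).items():
            # Lists and dicts are serialised to JSON text so they fit in one CSV cell.
            row[key] = json.dumps(value) if isinstance(value, (list, dict)) else value
        rows.append(row)
    return rows


def write_csv(rows, path):
    """Write the rows to path, using the union of all keys (in first-seen order) as header."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

Calling write_csv(flatten_records(fetch_genomes("MGYG000000001")), "genomes.csv") should then give a file with one row for that accession; "genomes.csv" is just a placeholder file name. Fields that are null in the JSON come out as empty cells.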
