I am trying to get a list of the taxids of all coronaviruses, which I plan to use in a script. Alternatively, the taxids of all viruses related to SARS-CoV would also be a good starting point.
However, I do not know how to extract this information efficiently. I found the following pages in the NCBI Taxonomy Browser, but they do not provide the information in a plain-text or tabular format that I can feed into a script:

1. SARS: https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Tree&id=694009&lvl=3&keep=1&srchmode=1&unlock
2. Coronaviruses: https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&id=11118&lvl=3&lin=f&keep=1&srchmode=1&unlock
With these pages, I would need to click each organism to find the taxid of an individual virus. Is there a more machine-readable source for this information? Alternatively, I could write a script that follows each link; is there an efficient way to do that? I can think of the following, but both seem convoluted:

1. Inspect the source of the page to find all links related to individual viruses
2. Use the lynx browser in a terminal
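For reference, option 1 could look roughly like the Python sketch below. It is only a sketch and has not been run against the live pages: it assumes the page HTML has already been saved to a string, and that each organism link points at `wwwtax.cgi` with the taxid in the `id` query parameter, as in the URLs above. The names `TaxidLinkParser` and `extract_taxids` are made up for this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs


class TaxidLinkParser(HTMLParser):
    """Collect taxid -> link text for <a> tags that point at wwwtax.cgi."""

    def __init__(self):
        super().__init__()
        self.taxids = {}       # taxid (int) -> link text (str)
        self._current = None   # taxid of the <a> currently being read, if any
        self._text = []        # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        parsed = urlparse(href)
        # Assumption: organism links go through wwwtax.cgi and carry "id=<taxid>".
        if not parsed.path.endswith("wwwtax.cgi"):
            return
        ids = parse_qs(parsed.query).get("id", [])
        if ids and ids[0].isdigit():
            self._current = int(ids[0])
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Keep the first name seen for each taxid.
            self.taxids.setdefault(self._current, "".join(self._text).strip())
            self._current = None


def extract_taxids(html):
    """Return a dict mapping taxid to link text for all matching links."""
    parser = TaxidLinkParser()
    parser.feed(html)
    return parser.taxids


# Made-up HTML fragment, only to show the call; not copied from the real page.
sample = (
    '<a href="wwwtax.cgi?mode=Info&amp;id=694009&amp;lvl=3">SARS</a> '
    '<a href="/Taxonomy/Browser/wwwtax.cgi?mode=Info&amp;id=11118">Coronaviruses</a> '
    '<a href="https://example.org/help.html">Help</a>'
)
for taxid, name in sorted(extract_taxids(sample).items()):
    print(f"{taxid}\t{name}")
```

This prints one tab-separated `taxid` and name per line, which is the kind of output I am after, but it depends entirely on the page layout and would break if the markup changes, which is why I am hoping for a proper machine-readable source.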
Thanks! I wasn't aware of this utility before. It will probably be helpful for other things I want to do as well.