When dealing with DNA sequences, one often comes across the scientific names of plants or animals, for instance well-known names like Felis catus or utterly unfamiliar ones like Enterocytozoon hepatopenaei. Should I learn these names, or at least the basics, so as not to get confused? Or is this just redundant information that is of little importance to a bioinformatician?
You should probably know the names of the species you are currently working on, and maybe their closest relatives. More importantly, you should know relevant information about your species, like genome size, heterozygosity, repeat content, ploidy, and so on (not that all of this information is available in the literature, anyway). This information will help you devise the best analytical strategies, and may prove important for troubleshooting when things don't work down the road.
Is it relevant to your job to immediately recognize a large number of these? If not, then the answer is "no". Most of us who studied biology don't know more than a handful of species names; that sort of thing is rarely useful for anyone to know. As a general rule of life, random facts like that aren't worth memorizing unless you want to take part in Jeopardy.
Almost certainly no. Beyond "Homo sapiens" and "Mus musculus", there are probably only two or three I'd recognize. You'll pick up domain-specific knowledge like that by diffusion from whatever project you're on, but a calculated study of them is probably not a good use of your time.
One additional opinion: when dealing with different unicellular species or even metagenomic samples, there's often not much more than the scientific name for a species, starting with well-known examples like Escherichia coli or Bacillus subtilis, two fairly common bacteria without trivial names, as far as I know.
However, you will learn the ones you need as you go. After my studies I didn't know any more than the others who answered before me. Today, due to the position I work in, I know dozens.
I think it is good to know the scientific names of model organisms or highly studied plants and animals, for example:
- rice (Oryza)
- maize (Zea mays)
- wheat (Triticum)
- zebrafish (Danio rerio)
- fruit fly (Drosophila)
Remembering those will do no harm. These are a handful of the ones you will hear or read about frequently in scientific forums, at conferences and in seminars.
Apart from that, as suggested by fellow Biostars users, there is no need to remember all of them (it is impossible!). Just look them up on a case-by-case basis.
To add to what others have said, knowing a lot of these is probably not useful, but being able to recognise the common ones will probably be helpful (know what they are when you see them, but not necessarily be able to write them yourself). I doubt anyone "learnt" these deliberately; they just become ingrained over time, and everyone's list is different, but my list would be:
Homo sapiens (human)
Mus musculus (house mouse)
Drosophila melanogaster (fruit fly)
Danio rerio (zebrafish)
Caenorhabditis elegans (roundworm)
Arabidopsis thaliana (thale cress)
Saccharomyces cerevisiae (baker's/brewer's yeast)
Escherichia coli (the common model bacterium)
Whether you want to memorise the names or not is a matter of choice and of the time and effort you are willing to dedicate.
However, we should comply with the standards and recommendations on taxonomy, which in my view shares qualities with any ontology (i.e. hierarchy, standards).
Latin names for species should be in italics.
The genus component of the name should start with an upper-case letter, e.g. Felis catus is OK, felis catus is not OK.
The species component of the name should be in lower case, e.g. Mus musculus is OK, Mus Musculus is not OK.
Depending on the context, these rules can be relaxed. If you write code, for example, you will not write the names in italics.
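As a rough illustration of the capitalization rules above, here is a minimal Python sketch that normalizes a name typed in the wrong case. The function name `normalize_binomial` is made up for this example, and the sketch assumes a plain genus/species(/subspecies) name; it would wrongly lower-case things like strain designations, so treat it as a starting point only.

```python
def normalize_binomial(name: str) -> str:
    """Capitalize the genus and lower-case the remaining parts of a name.

    Hypothetical helper for illustration; it assumes whitespace-separated
    parts and does not validate that the name is a real taxon.
    """
    parts = name.split()
    if len(parts) < 2:
        raise ValueError(f"expected at least genus and species, got {name!r}")
    genus = parts[0].capitalize()              # "felis" or "FELIS" -> "Felis"
    epithets = [p.lower() for p in parts[1:]]  # "Musculus" -> "musculus"
    return " ".join([genus, *epithets])


if __name__ == "__main__":
    for raw in ("felis catus", "Mus Musculus", "HOMO SAPIENS"):
        print(raw, "->", normalize_binomial(raw))
```

Italics are a typesetting matter, so they are left to whatever renders the name.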
I don't think there is such a thing as redundant information. The more you work with Felis catus, the more you will know that it is Felis catus and not felic catus.
It's important to get the name right. It's UniProt, not Uniprot or uniprot. It may sound pedantic. It may be pedantic. But we need to use the right name. In a piece of code, the wrong name may make a difference: it can give you an error or wrong output. So details are important.
I look at all information as relevant information. Some is very relevant, some not so much.
Note, we say mm10 for the genome assembly for mouse, because mm = Mus musculus.
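To make the mm10 example concrete, here is a small Python sketch that maps a UCSC-style assembly name back to a species. The lookup table is only a hand-picked subset of commonly used prefixes, and the function name `species_for_assembly` is made up for this example. Note that not every prefix is built from the Latin name: hg38 is the human assembly, but "hg" does not come from Homo sapiens.

```python
import re

# Small illustrative subset of UCSC-style assembly prefixes (not exhaustive).
UCSC_PREFIX_TO_SPECIES = {
    "hg": "Homo sapiens",
    "mm": "Mus musculus",
    "danRer": "Danio rerio",
    "dm": "Drosophila melanogaster",
    "ce": "Caenorhabditis elegans",
    "sacCer": "Saccharomyces cerevisiae",
}


def species_for_assembly(assembly: str) -> str:
    """Return the species for names like 'mm10' (letters followed by digits).

    Hypothetical helper for illustration. The lookup is case-sensitive on
    purpose: 'MM10' is not the same identifier as 'mm10'.
    """
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", assembly)
    if match is None:
        raise ValueError(f"not a prefix+number assembly name: {assembly!r}")
    prefix = match.group(1)
    if prefix not in UCSC_PREFIX_TO_SPECIES:
        raise ValueError(f"unknown assembly prefix: {prefix!r}")
    return UCSC_PREFIX_TO_SPECIES[prefix]


if __name__ == "__main__":
    for name in ("mm10", "hg38", "danRer11"):
        print(name, "->", species_for_assembly(name))
```

The strict, case-sensitive lookup is the same point made above about UniProt: a slightly wrong name fails loudly here instead of silently returning the wrong species.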