Forum: New to bioinformatics...
2
0
5.0 years ago
sms.00196 • 0

Newbie Rant here: Am I the only one who thinks that the learning curve for the NCBI databases and tools is unnecessarily steep?

I want to obtain and use the information, not take hours to locate it.

There...I feel better now!

gene Forum • 1.9k views
1

Well, everybody wants to hit Enter and get the desired result :) Unfortunately (or fortunately?) it doesn't work like that; you have to put in some (sometimes substantial) effort.

0

Seems to me that there is a golden opportunity for the private sector to develop a software tool that does exactly this. I state what question I want to answer, give the data I have, and hit Enter. The program does the rest: it chooses the best search parameters based on the query and produces the most relevant output.

4

this has been the business model pushed by dozens of failed bioinformatics startups since 1996

2

They have all approached it the wrong way. What would really be a success is this: someone gives the software the answer they want and hits Enter. Then the software finds the question and the data to satisfy the answer.

2

We have that, it's called simulation software :-)

0

Computer, can you give me the ultimate answer to life, the universe, and everything?

1

This actually makes a really good point. Multiple tools exist to get tasks done, and we take the call on which of them to use based on context and experience. When such a complex problem is given to computers, especially when we do not have well established and agreed upon training sets, machines will take too long to produce an answer that makes sense to the algorithm but not to the research world in general, AKA 42.

0

Fun fact: ASCII code 42 is the asterisk (*), i.e. the "everything" wildcard.
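
For anyone who wants to check this, here is a quick sketch in Python using only the standard library (`chr`, `ord`, and `fnmatch`). The sample strings are made up for illustration:

```python
from fnmatch import fnmatch

# Character code 42 in ASCII is the asterisk.
star = chr(42)
print(star, ord("*"))

# As a shell-style glob pattern, "*" matches any string, including the empty one.
samples = ["life", "the universe", "everything", ""]
print(all(fnmatch(s, star) for s in samples))
```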

1

You want the private sector to develop an AI bioinformatician? Because that's what you're describing: an entity that takes the right call on which methods to use, given the data and an idea of the kind of results you're looking for.

0

Yes, I think algorithms could do this. Sure, lots of work. But what a product! .gov is not going to have the incentive to do this.

4

You weren't lying, you are new :-)

Current AI technology cannot be trusted to do our job. They can't even be trusted to accurately sequence a genome, and that's just one machine doing one task.

0

Well said Ram! Full agreement

1

This is a complex issue, I think. Of course it sounds sweet, but this software would cost a lot (it solves important problems, saves time, and is very complex), while the majority of scientific organizations in the world (I don't mean top Western institutions) are poor :) And I guess that's why, as mentioned by @Jeremy, we see quite a limited number of really successful and big bioinformatics companies today (and most of them are quite recent because of the NGS revolution). We can keep discussing.

0

I've seen the 'products' of these companies, and the analyses are invariably conducted incorrectly, with sloppy figures to boot. On top of this, they charge you both an arm and a leg.

0

Most of the bigger SaaS players (e.g. SBG, DNANexus) have taken a different approach than selling "we can make your analysis easy" to every PI and have made most of their money through partnerships either with government or pharma.

0

I partially agree. Learning the nuances of the differences between databases can take time, but for 'newbies' it's typically enough to get comfy with (t)BLAST(n/p) and probably PubMed.

It doesn't get much simpler than a search box and a database though (which is exactly how NCBI is built), so I'm afraid there is no option other than to suck it up and get stuck in!

0

Maybe that's the problem. The input part seems simple, but interpreting the output is onerous. I want to spend time on science, not the technology. But, unfortunately, the reality is that the scientists and the technicians each need to understand the other. A nice partnership if you can get it.

0

Is it?

Personally, I think the output of a BLAST search, for instance (I'd wager probably the single most used feature of NCBI's infrastructure, maybe after PMC), is incredibly intuitive. You even get a picture! And I'm not saying that as a now semi-seasoned bioinformatician/molecular biologist. I remember learning what BLAST was for the first time in my bachelor's degree as if it were yesterday, and it just 'clicked'. Granted, this is just a personal anecdote, and not everyone thinks the same way, but I really don't think you could ask for much more intuitive output.

My main criticism of the NCBI interface even now, is that the 'click through-y-ness' is opaque sometimes for sure. Trying to download all the genomes for a particular BioProject, or taxon or whatever, can require some pretty intimate understanding of how NCBI structures their data.
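
For what it's worth, the usual way around the clicking is to script the query through NCBI's E-utilities. Below is a minimal sketch that only builds an ESearch URL and does not contact NCBI. The helper name `build_esearch_url` and the accession `PRJNA000000` are hypothetical, and the `[BioProject]` field tag on the `nuccore` database is an assumption to check against that database's field list before relying on it:

```python
from urllib.parse import urlencode

# Base URL of the ESearch endpoint in NCBI's E-utilities.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Return an ESearch URL that lists record IDs matching `term` in `db`."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Hypothetical BioProject accession, used only for illustration.
url = build_esearch_url("nuccore", "PRJNA000000[BioProject]")
print(url)
```

The IDs returned by a search like this would then be passed to EFetch to download the records themselves, which is the step that tends to need the "intimate understanding" mentioned above.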

2

"My main criticism of the NCBI interface even now, is that the 'click through-y-ness' is opaque sometimes for sure."

To be fair, the things you are listing are not simple queries/tasks. They inherently require clarity about what you are trying to achieve. Designing user interfaces (that are intuitive) is an art. I am sure NCBI keeps evolving those over time. Because NCBI is so large a repository, the regression testing they need to do to ensure that things don't break must be a task in itself. Since BLAST is their most popular tool, they do keep that in pretty good shape/current/fresh.

0

Oh, I quite agree, but even something that one might expect to be a simple process (like getting a full GenBank download properly!) can trip you up.

0

Same issue in wet lab work. The techniques are complex and a field in themselves, but not basic science.
The primary reason I chose in silico work was to get away from the burden of wet lab work. Unfortunately, I find it's as bad or worse. Interested in others' views on this.

4

So both wet lab and in silico work are too complex for you. Well, I have some bad news for you then.

3

Nonsense, this guy is executive material. He'll be my boss someday.

1

"What my team has developed under my expert leadership is a comprehensive solution for data-driven predictive personalized medicine and cost savings. Big Data. IoT."

1

I'm going to be sick.

1

Don't worry we have an app for that, too.

1

Can you elaborate on the wet lab point? I'm a wet lab biologist (primarily in fact). If anything I think the wet lab has too much of the opposite problem. The techniques are so old and ingrained in a lot of cases, everyone just buys kits and black boxes everything - it's pretty much impossible to understand the basic science of every tiny thing.

But that's how it has to be. Science is too big now for everyone to be an expert in very much at all.

1

Yes, I was also primarily wet lab based before branching into bioinformatics. Science in the wet lab has become 'kit-based', and many of these kits are expensive and do not even work. Companies even sell them to you after the 'sell-by' date. So much money is wasted in research as a result of this, unnecessarily so. If companies actually tested their products properly, instead of just releasing their own curated white papers, then it might improve.

0

The interface and use of most NCBI resources are reasonably easy, and with a bit of reading and searching you can even become a "power user" rather quickly.

If you are talking about SRA and the SRA Toolkit, on the other hand, I wholeheartedly agree.

0

You forgot to add the NCBI Unix command-line utils (to the list with SRA) :-)

3
5.0 years ago
Ram 38k

"Am I the only one who thinks that the learning curve for the NCBI databases and tools is unnecessarily steep?"

Maybe? One of the skills a bioinformatician needs is on-demand learning. Focus on knowing what tools can do, not learning how to do everything before you start using them. NCBI is one of the easiest databases out there and its results are as clear as can be.

"I want to obtain and use the information, not take hours to locate it."

What information? Bioinformatics databases are not service sites like Amazon or eBay; they are data-sharing sites. We are not entitled to find results relevant to us quickly. As time goes on, we come to understand how these tools work and how to get what we need out of them in the best way possible. If you have an idea for how to do something better, do it and we will support you. But remember, new tools don't always help:

1

My favourite XKCD cartoon of all time by far.

2
5.0 years ago
GenoMax 127k

Have you looked at the learning resources that NCBI has? There are plenty of YouTube videos if you prefer to learn that way.