Forum: Why Are There No Proteomics Questions In Biostar?
14
8.5 years ago by
Jdnavarro410
Sevilla, Spain
Jdnavarro410 wrote:

I've been following BioStar for a while now and I've noticed the conspicuous lack of any proteomics questions.

For many years I've heard within the field that the main bottleneck to progress in proteomics is informatics. Yet in spite of proteomics being heavily funded, why is there this lack of interest in proteomics?

Potential factors that can contribute:

(I'm not claiming that all of these factors are real; I'm just offering ideas.)

  • Proteomics informaticians are rewarded only for publications, not for the usability of their software. Other bioinformatics fields went through this stage. Usability of proteomics software is still not considered important. There is no interest in real software, only in proofs of concept.
  • There are many proteomics informaticians, but they are not inclined to share knowledge: it took them too long to learn the idiosyncrasies, and they want to keep their exclusivity.
  • Proteomics data is inherently much more complex than genomics/pathway data. The reward for mastering proteomics data doesn't pay off, so very few bioinformaticians want to get into it.
  • Current proteomics software is awful. It's impossible to innovate using the current software as a foundation. PIs/experimentalists/employers always demand that new software be built on top of this crappy software, there is no
  • Mass spec vendors, with their proprietary formats, make it impossible to build anything on top of their software.
  • There are no proteomics informaticians because employers don't know how to recruit and manage bioinformaticians.
  • Proteomics is highly politicized. The moment an external bioinformatician shows an alternative approach, he/she is driven off by the proteomics community, which sees the newcomer as a threat. All the funding goes to people who maintain the status quo.
  • There are many proteomics informaticians, but they are too stressed to spend time on BioStar. But if they are that stressed, it is because there is too much work for too few people, so this is not a real factor; the question remains why there are so few.

Have you experienced any of these factors? Why is there no interest in proteomics informatics? How do you see proteomics informatics from the outside?

Update

Daniel Standage pointed out the lack of publicly accessible data compared to genomics.

biostar forum proteomics • 4.0k views
modified 4.4 years ago by Michael Dondrup (46k) • written 8.5 years ago by Jdnavarro (410)

I asked a proteomics question regarding proprietary vendor formats just the other day. I would not assume that a lack of questions here equates to a lack of interest elsewhere. There could be many reasons why we see fewer proteomics queries.

Reply written 8.5 years ago by Neilfws (48k)

I think you have posed and answered the question simultaneously. I certainly see many of the bullet points you have raised.

Reply written 8.5 years ago by Alastair Kerr (5.2k)

I only see 2 questions under the proteomics tag, and I can't find yours... I know there can be many reasons, but it puzzles me why genomics gets sites like BioStar whereas in proteomics you can't find anything like that.

Reply written 8.5 years ago by Jdnavarro (410)

Fantastic summary. I do experience some of the factors, though not all of them. That, however, is what I love about this field: there is so much work to be done, so much low-hanging fruit from the IT perspective.

Reply written 7.9 years ago by Roman Zenka (10)
7
8.5 years ago by
Brianbalgley100
Brianbalgley100 wrote:

A full answer would require too long a post, but a few points:

The mass-spectrometry based proteomics market is relatively small, and proprietary formats have hindered development of non-instrument specific software. But this is changing - the folks at ProteoWizard have a tool to convert almost any proprietary format to one of the open formats (as long as you are doing it on a computer with the vendor software installed).

Industry jumped into proteomics around 2000, and it was too early: the methods, instruments and software were not ready. But companies are taking another look now, which will help to drive proteomics software development.

There is excellent open source proteomics software, TheGPM being one of the better examples, as well as the aforementioned ProteoWizard.

There is a LOT of freely available proteomics data out there. Some 13+ TB of raw and processed data on Tranche (some is protected, but increasingly it is open). And a trove on GPMdb.org made very useful with some custom tools.

Also, there have not been any major proteomics characterization efforts. But this, too, is changing: http://grants.nih.gov/grants/guide/rfa-files/RFA-CA-10-016.html This effort will generate a large amount of data.

Overall, proteomics is still at a relatively early stage, maybe circa 1990-95 compared to genomics.

written 8.5 years ago by Brianbalgley (100)
6
8.5 years ago by
Daniel Swan13k
Aberdeen, UK
Daniel Swan13k wrote:

I know my colleague Simon would normally answer this, as he was recruited to our institution specifically to deal with proteomics and protein informatics. The simple fact of the matter is that, certainly at our place of work, very little high-throughput work is being done. Dealing with piecemeal protein informatics is within the grasp of most bioinformaticians, but I think the number of people producing and needing to analyse vast quantities of high-throughput proteomics data is vanishingly small compared to sequence-based genomics, or even microarray-based transcriptomics.

written 8.5 years ago by Daniel Swan (13k)

I think nobody in bioinformatics expects to find already-trained proteomics informaticians on the job market. AFAIK most proteomics labs hire bioinformaticians who will then be trained in proteomics data. But there are still not that many proteomics informaticians.

Do you think it's a problem of recruitment or that non-proteomics bioinformaticians are not interested in getting trained in proteomics?

Reply written 8.5 years ago by Jdnavarro (410)
6
8.5 years ago by
Mrawlins420
Retirement
Mrawlins420 wrote:

Getting started in proteomics is hard compared to genomics. I started out in LC-MS, then LC-MS/MS and ended up in RNA-Seq for next-gen sequencers. Understanding genomic and transcriptomic studies does not require much understanding of chemistry. You need to understand cellular processes to interpret experiments, and a little probability/statistics for sequence analysis, and then you're started in the field. Getting good requires more knowledge, but it's pretty accessible to a newbie to start messing around with.

Proteomics is tougher because the data are much more dependent on chemical phenomena. There is the chromatographic separation that has to be understood. In particular for MS/MS the fragmentation is very much a product of molecular orbital bond dissociation energies, which gets into some pretty awful quantum physics, physical chemistry and statistical mechanics. There are ways of abstracting this into something easier to compute (which is what everyone in proteomics has done), but that abstraction introduces some errors and bias that propagate through all the downstream analysis. The equivalent steps in genomics seem to be much better understood and most of the bias removed (though it still shows up in the error rates of next-gen sequencers).
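As a rough illustration of the kind of abstraction mentioned above, the sketch below ignores the physical chemistry entirely and treats MS/MS fragmentation as a clean break at each peptide bond, giving singly charged b and y ions. The function name `fragment_mz` and the example peptide are assumptions made for illustration; the residue masses are standard monoisotopic values. Real spectra deviate from this model (neutral losses, uneven intensities, other ion types), which is one place the errors and bias come from.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O
PROTON = 1.007276   # mass of the charge-carrying proton


def fragment_mz(peptide):
    """Return (b_ions, y_ions): singly charged m/z values for each cleavage.

    b_ions[i] covers the first i+1 residues (N-terminal fragment);
    y_ions[i] covers the last i+1 residues (C-terminal fragment).
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    b_ions, y_ions = [], []
    for i in range(1, n):
        b_ions.append(sum(masses[:i]) + PROTON)
        y_ions.append(sum(masses[n - i:]) + WATER + PROTON)
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = fragment_mz("PEPTIDE")
    for i, (b_mz, y_mz) in enumerate(zip(b, y), start=1):
        print(f"b{i} = {b_mz:.4f}   y{i} = {y_mz:.4f}")
```

Search engines typically compare lists like these against the observed peaks within a mass tolerance; everything that the simple model leaves out ends up as unexplained signal or as a missed match.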

Overall, proteomics is getting better and becoming more accessible as some of these core problems are worked through. It would be nice to see more community building in proteomics. I wonder how much of this, though, comes from the fact that most of what's on BioStar is related to genomics/transcriptomics, so proteomics people don't stumble across it, so they don't post to it, so they don't realize it's an option. I wouldn't be here if it weren't for my cross-over work in transcriptomics. Just my 2 cents.

written 8.5 years ago by Mrawlins (420)
5
8.5 years ago by
Daniel Standage3.8k
Davis, California, USA
Daniel Standage3.8k wrote:

Interest may be a part of the issue, but it's not the heart of the issue in my opinion. One of the reasons bioinformatics is so badly needed is because it's not just J. Craig Venter who is creating "genomics"-scale data nowadays. Any Dr. Joe Schmoe with a couple thousand dollars can get his hands on gigabytes of raw sequence data, thanks to Illumina, Roche/454, et al. I think this is great, but it has had a huge impact on the approach many biologists take to their research.

Advances in high-throughput proteomics have not been able to keep up with their nucleotide counterparts. There are probably many reasons for this, and we need not assume that lack of interest is the primary one. But, as Giovanni said, it's a vicious cycle. People have more nucleotide data, so they spend more time developing nucleotide informatics, so industry focuses more on nucleotide-based omics platforms, so we get even more/better nucleotide data, and so on and so forth.

written 8.5 years ago by Daniel Standage (3.8k)

I take this answer as pointing to the lack of publicly accessible data. I forgot to add that factor.

Reply written 8.5 years ago by Jdnavarro (410)

I think there is a lot of data in the field, but there is reluctance to share it. Why that is, is another story.

Reply written 8.5 years ago by Jdnavarro (410)

I'm not sure it is a lack of shared data! I don't even think it is a reluctance to share data, but rather is more about what data do we share? From that aspect, the nucleotide side of things is way ahead of the proteomics field.

Reply written 8.5 years ago by Julian (200)

Not to mention the proprietary formats and the messy open ones.

Reply written 8.5 years ago by Paulo Nuin (3.7k)
4
8.4 years ago by
Delagoya60
Delagoya60 wrote:

I think D Swan is spot on, there just isn't that much ongoing work at most institutions. I work on proteomics, sit right next to two groups that do large scale experiments, and we still only spend 1/4 of our effort on proteomics. The rest is spent on genomics and integration of past proteomics experiments with them.

written 8.4 years ago by Delagoya (60)
3
8.5 years ago by
London, UK
Giovanni M Dall'Olio26k wrote:

In my experience, people working on proteomics are more often computer scientists than researchers with a biological background. A few years ago I gave a talk on bioinformatics at a Python programming conference, and the majority of the people who approached me were computer scientists working on proteomics. Maybe this is because of what you say in your post: proteomics requires better programming skills, and a biological background may be less important when developing software to read the results of protein fingerprinting. So it may be that this website is frequented more by bioinformaticians with a background in biology.

Moreover, it is a vicious cycle.

Some time ago I was the moderator of an Italian web forum on science. Up to a certain point we had very few questions about cancer therapies; then one day some people asked a few questions about them. The way our forum was indexed by Google changed drastically within a few days: we were ranked first for 'cancer' in Italian, all the Google ads changed and started advertising cancer therapies, and a lot of new users came asking about the same topic. After some discussion, since none of the moderators was a medical doctor and we were not able to judge the quality of the answers given (beware, on the Internet there are a lot of people selling false therapies), we had to forbid any direct medical question in the forum, and after a week the situation went back to how it was before.

So there are few people working on proteomics who are aware of this website, while the website is frequented by a lot of people who are experts in other fields; therefore there are few questions on proteomics here. In turn, Google and the other search engines have not given this website a high score for proteomics-related queries, so few people interested in proteomics see it; and so on.

If you want more proteomics-related questions here, just ask a few yourself, using a good title, and hope that they get indexed correctly by the search engines. It may work very quickly.

written 8.5 years ago by Giovanni M Dall'Olio (26k)

Initially I started following BioStar because I thought I could be helpful to people trying to get into proteomics. I think it's a good idea to start asking some of the basic questions I imagine anyone would have when starting in proteomics, and to see how search engines pick them up.

Reply written 8.5 years ago by Jdnavarro (410)
1
8.4 years ago by
Bio_X2Y3.7k
Ireland
Bio_X2Y3.7k wrote:

I think a lot of questions that people have in relation to proteomics software have already been addressed on other sites.

For example, the Seattle Proteome Center has a Google group for the discussion of their tools (http://groups.google.com/group/spctools-discuss). It currently has approx. 9000 threads, so it is a very good resource for asking proteomics questions. Many questions aren't specifically related to the SPC tools; it's basically a place where the whole proteomics software ecosystem is discussed.

modified 8.4 years ago • written 8.4 years ago by Bio_X2Y (3.7k)

I used to follow that group some time ago. I found it too specific to TPP, but I will give it a try again. Thanks for reminding me of that one.

Reply written 8.4 years ago by Jdnavarro (410)
1
4.4 years ago by
dario.garvan440
Australia
dario.garvan440 wrote:

There is no benchmarking community that compares alternative methods on a single dataset. For DNA and RNA analysis there are projects such as SEQC and the DREAM Challenges. It's also hard to develop improved methods when there are few details of how existing methods, such as ProteinPilot and Mascot, work. The only way is to demonstrate improved identification and quantitation, which requires a dataset based on known proteins and dilutions of them. That's something the proteomics community has never generated.
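To make the benchmarking idea concrete, here is a minimal sketch of how such a dataset could be used to score quantitation: each known protein is spiked in at a known dilution ratio between two samples, and a method is judged by how far its measured log2 ratios fall from the expected ones. Everything here is hypothetical, including the protein names, the intensities and the `quant_error` helper; it does not describe how ProteinPilot, Mascot or any other tool works.

```python
import math


def quant_error(expected_ratio, measured):
    """Mean absolute log2 error between measured and expected ratios.

    expected_ratio: {protein: known ratio of sample B to sample A}
    measured: {protein: (intensity_A, intensity_B)} reported by a method
    Proteins the method did not quantify are skipped; the number of
    proteins actually scored is returned alongside the error.
    """
    errors = []
    for protein, ratio in expected_ratio.items():
        if protein not in measured:
            continue
        a, b = measured[protein]
        errors.append(abs(math.log2(b / a) - math.log2(ratio)))
    if not errors:
        return float("nan"), 0
    return sum(errors) / len(errors), len(errors)


# Hypothetical spike-in design and made-up outputs of two methods.
expected = {"protA": 2.0, "protB": 0.5, "protC": 10.0}
method_1 = {"protA": (100.0, 210.0), "protB": (80.0, 35.0), "protC": (10.0, 70.0)}
method_2 = {"protA": (100.0, 150.0), "protB": (80.0, 60.0)}

for name, result in [("method_1", method_1), ("method_2", method_2)]:
    err, n = quant_error(expected, result)
    print(f"{name}: {n} proteins quantified, mean |log2 error| = {err:.3f}")
```

Reporting the number of proteins quantified next to the error matters, because a method can look accurate simply by skipping the hard cases.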

There are also no good review articles to describe the statistical and computational challenges. A search of PubMed for proteomics with filtering for review articles shows plenty of reviews from the chemistry and biophysics fields, but almost none from bioinformatics.


 

written 4.4 years ago by dario.garvan (440)