What Features From Publishers Would Help Bioinformatics Folks?
5
12
11.7 years ago
Mary 11k

So I was looking into the new browser/viewer that's been integrated into Elsevier's papers. There was some discussion of it on our blog yesterday. But it's got me thinking about what would be really useful for taking data in a paper further, more easily and fluidly.

What tools and features would you want in an app from publishers that would make the lives of bioinformatics folks easier?

For me, better text mining tool integration would be great. I can't even believe how bad the search is just for authors sometimes. I like the idea of gene lists too, which could be quickly obtained in one fell swoop and used with other tools. It also stuns me that "big data" sets are not always easily linked to where they can be found in a live browser.

EDIT WEDNESDAY Aug 3: Just saw this tweet--a way to get useful stuff added, might be of interest to this community: RT @geneticsblog: . @PLoS and @mendeleycom Call for Apps: http://bit.ly/oc2NGL and http://bit.ly/nHYqNa

text literature • 3.3k views
3

What I want from publishers is a complete overhaul of the entire academic publishing system, starting with the abolition of the traditional journal article and ending with something that resembles GitHub. Unrealistic, me? ;)

0

I agree, I had the same problem finding the actual genome viewer. At some point they say it uses NCBI's genome browser.

0

On our blog there was some discussion of why it may not be visible, and people gave me screenshots. But I'm talking to Elsevier about getting access. I assumed it would be there on the demo ones even if I am not a subscriber.

0

Giggle. Good luck with that @neilfws. But I know you aren't alone.

9
11.7 years ago

The lack of a reproducible research standard is the #1 problem in scientific publishing.

If data and code were married to results using Sweave or other tools then stuff like the Anil Potti incident wouldn't have required a full scale covert investigation to uncover:

http://www.nytimes.com/2011/07/08/health/research/08genes.html

1

Great point. It also makes me crazy that I can't get the version number of the software I'm running in certain workflow tools. I keep asking for it anyway.
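
As a stopgap when a tool won't report its own version, the versions can at least be captured by the script that drives the analysis. A minimal Python sketch, assuming the tools in question are installed as Python packages; `record_versions` and the package names passed to it are hypothetical and not part of any workflow tool mentioned in this thread:

```python
import platform
from importlib import metadata


def record_versions(packages):
    """Map each package name to its installed version (None if not installed)."""
    # The interpreter version is always recorded under the key "python".
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Recording the absence is more useful than failing the whole run.
            versions[name] = None
    return versions
```

Writing the returned dictionary next to the results (for example as a small JSON file) gives a reader at least a starting point for rerunning the analysis with the same software.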

1

One of those answers I wish I could +2.

1

I disagree. The #1 problem in scientific publishing is the lack of universal open access; the #2 problem is that research data (reproducible or otherwise) is delivered in an unstructured format. Sweave, while kewl, is a technological red herring and not the solution to the major problems in scientific publishing.

0

One thing I have started doing so that other people can reuse my data and code more easily: Makefiles that reproduce the results from raw data just by calling the default target, sometimes combined with LaTeX report generation that uses images generated by the scripts.
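
The Makefile approach boils down to one rule: rebuild a target only when it is missing or older than its inputs. A rough Python sketch of that rule follows. It is not the actual Makefiles described above, and the file names and the `summarize` recipe are made up for illustration:

```python
import tempfile
from pathlib import Path


def is_stale(target, sources):
    """True if target is missing or older than any source (make's rebuild rule)."""
    if not target.exists():
        return True
    target_mtime = target.stat().st_mtime
    return any(src.stat().st_mtime > target_mtime for src in sources)


def build(target, sources, recipe):
    """Run recipe only when target is stale; return True if it was rebuilt."""
    if is_stale(target, sources):
        recipe(target, sources)
        return True
    return False


def summarize(target, sources):
    """Hypothetical recipe: write the mean of the numbers in the raw file."""
    values = [float(line) for line in sources[0].read_text().split()]
    target.write_text(f"mean={sum(values) / len(values):.2f}\n")


def demo():
    """Build twice from the same raw data; only the first call should do work."""
    with tempfile.TemporaryDirectory() as tmp:
        raw = Path(tmp) / "raw_counts.txt"
        result = Path(tmp) / "summary.txt"
        raw.write_text("1\n2\n3\n")
        first = build(result, [raw], summarize)
        second = build(result, [raw], summarize)
        return first, second, result.read_text()
```

The point of the design is that the published result can always be traced back to raw data plus a recipe, and nothing is regenerated unless an input actually changed.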

0

Something like this should be required for papers: Reproducible Research: A More General Sweave?

0

I don't see how making journals open access or easier for robots to read would have prevented the Duke errors. Patients received the wrong chemo because some researchers weren't required to provide reproducible code.

0

The question here is about what could be changed in the scientific publishing industry to make bioinformatics easier, not what can be done to prevent errors in scientific judgement or malpractice. Bad scientists make mistakes, intentionally or unintentionally. It is not realistic to think that just providing a LaTeX file with some R code in it is going to solve this. The problem in this case lies with the researchers and reviewers of this work, not with the publishers or the publishing industry.

4
11.7 years ago

I want:

1. some semantics (RDFa?) in the abstract/paper. Not just:

"NSP3 interacts with RoXan"


but

<span property="http://purl.obolibrary.org/obo/INO_0000026">
interacts with
</span>.

2. a flag saying:

"the software/database/facts described in this paper is/are deprecated"

3. Requiring authors to contribute the content of their article to Wikipedia (e.g. http://en.wikinews.org/wiki/RNA_journal_submits_articles_to_Wikipedia)

4. Requiring authors to post their software/tools in a public repository (GitHub, etc.)
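
To show what item 1 would buy a text miner, here is a minimal Python sketch that pulls `property` annotations out of a marked-up sentence using only the standard library. It is not a full RDFa processor: it ignores subjects, objects, prefixes, and void elements such as `<br>`. The sample sentence combines the two snippets from item 1, and the names `PropertyExtractor` and `extract_properties` are invented for this example:

```python
from html.parser import HTMLParser

SAMPLE = (
    'NSP3 <span property="http://purl.obolibrary.org/obo/INO_0000026">\n'
    "interacts with\n"
    "</span> RoXan."
)


class PropertyExtractor(HTMLParser):
    """Collect (property IRI, text) pairs from elements with a property attribute."""

    def __init__(self):
        super().__init__()
        self._stack = []       # property IRI (or None) of each open element
        self.statements = []   # extracted (IRI, text) pairs, in document order

    def handle_starttag(self, tag, attrs):
        self._stack.append(dict(attrs).get("property"))

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse line breaks and extra spaces
        if text and self._stack and self._stack[-1]:
            self.statements.append((self._stack[-1], text))


def extract_properties(html):
    """Return every annotated text span found in the given HTML fragment."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.statements
```

Calling `extract_properties(SAMPLE)` yields the ontology IRI paired with the text "interacts with", with no natural language processing involved.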
1

Up vote for all but 3)

0

@Jerven: that's interesting! :-) Why not? (I've added a ref in my answer.)

0

For (1), I'd prefer to go the full semantic web way and add a proper qualifier as well.

3
11.7 years ago
Kim ▴ 100

Clear identification of the materials used or produced. This includes providing complete names or identifiers for the organism or sample, the gene, and the sequences mentioned - use available identifiers as much as possible (especially accession.version).

0

The accession version is a big question of mine, as is how updates would be handled in the browser. If we are looking at giWhatever.2 in a paper, and the link goes to giWhatever.8 later, it could affect one's conclusions about the gene in that paper. Will end users know this?

0

Publishers need to set a standard to cite accession.VERSION (or GI), and users need to realize that the accession alone doesn't identify a specific sequence at a point in time; only the combined identifier or GI can do that.

If you query NCBI with accession.version, and it is an older record, the results page does include a message indicating that and offering the choice to see that version or see the newest record.
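
A small Python sketch of how a reader-side tool might check whether a cited identifier is pinned to a version and whether it has since been superseded. The regular expression is a deliberately loose assumption (capital letters, an optional underscore, digits, an optional `.version`), not NCBI's official accession grammar, and the function names are invented for this example:

```python
import re
from typing import NamedTuple, Optional


class VersionedAccession(NamedTuple):
    accession: str
    version: Optional[int]  # None when the citation omits the version


# Loose, assumed pattern; real accession formats are more varied than this.
_ACCESSION_RE = re.compile(r"^([A-Z]+_?\d+)(?:\.(\d+))?$")


def parse_accession(text):
    """Split 'ACCESSION.VERSION' into its parts; the version may be absent."""
    match = _ACCESSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not an accession[.version] string: {text!r}")
    accession, version = match.groups()
    return VersionedAccession(accession, int(version) if version else None)


def citation_is_outdated(cited, current):
    """True if both name the same accession but the cited version is older."""
    old, new = parse_accession(cited), parse_accession(current)
    if old.accession != new.accession:
        raise ValueError("different accessions cannot be compared")
    if old.version is None or new.version is None:
        raise ValueError("both identifiers need an explicit version")
    return old.version < new.version
```

For example, `citation_is_outdated("NM_000546.5", "NM_000546.6")` returns `True`; the identifiers here are used purely as example strings. A bare accession raises an error in the comparison, which is the point: without the version there is nothing to compare.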

2
11.7 years ago

At the 25-year SwissProt conference in 2007, Amos Bairoch proposed that all papers be submitted to publishers with some markup language identifying genes, proteins and other data. I tried googling to see if I could find this project but had no luck. Basically, it would remove the need for natural language processing on new papers and allow automatic data integration into databases such as those in IMEx, as well as better custom searching.

Not so much a killer app, but a better method of publishing science. I'd love to see this implemented in some form but I guess forming standards between publishers would be a major headache.

0

Yeah, I've been in a variety of rooms with database folks who crave standards that would help them extract what they need to get into their databases. Since the mid-90s. Progress has been...um...slow. They have tried to work with publishers with mixed success.

1
11.7 years ago
Kim ▴ 100

About the Elsevier genome display - it is an embedded browser provided by NCBI. See here: http://www.ncbi.nlm.nih.gov/projects/sviewer/embedded.html

0

Yeah, but how it is implemented in the context of the papers, how it will be maintained and updated, and more--still need to be explored in situ.