Question: Exponentially Increasing Genomes Slide
14
8.6 years ago by
Lee Katz2.9k
Atlanta, GA
Lee Katz2.9k wrote:

I always see a slide in talks that shows an increasing number of genomes available in GenBank or another database. Where is this slide from? I have seen an outdated one from Genomes Online but nothing recent.

How can I find this graph and cite it for my own talk?

genome graph • 3.7k views
ADD COMMENTlink modified 8.6 years ago by Bjoernsen40 • written 8.6 years ago by Lee Katz2.9k

I guess the Genomes Online one is the best answer. Thank you for the boost on my question, giovanni.

Maybe a better question would be, where are these data so that we can generate our own pretty graphs? But then again, I realize that the data are out there--you just have to find them and bring them together yourself!

That said, everyone gave really great answers, and I learned a lot from going through your links and what you said. Thank you all!

ADD REPLYlink written 8.6 years ago by Lee Katz2.9k
11
8.6 years ago by
brentp23k
Salt Lake City, UT
brentp23k wrote:

check here

ADD COMMENTlink written 8.6 years ago by brentp23k

what a pity the graph is so damn ugly!

ADD REPLYlink written 8.6 years ago by Yannick Wurm2.3k

I had to use internet explorer to get the numbers, but it's suggesting the relative growth rate is decreasing, and that 2000 was an outlier year (and obviously 1983).
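For anyone who wants to check a claim like this themselves, here is a minimal sketch of how relative (year-over-year) growth could be computed from cumulative totals. The function name `relative_growth` and the numbers are made up for illustration; they are not the figures from the linked graph.

```python
def relative_growth(totals):
    """Return {year: fractional growth over the previous listed year}.

    `totals` maps a year to the cumulative count at the end of that year.
    """
    years = sorted(totals)
    rates = {}
    for prev, curr in zip(years, years[1:]):
        if totals[prev] > 0:  # skip years where the ratio is undefined
            rates[curr] = (totals[curr] - totals[prev]) / totals[prev]
    return rates


# Hypothetical cumulative totals, for illustration only (not real GenBank data).
example_totals = {2000: 100, 2001: 200, 2002: 300, 2003: 390}
example_rates = relative_growth(example_totals)

for year, rate in sorted(example_rates.items()):
    print(year, f"{rate:.0%}")
```

In this toy series the absolute yearly increase stays near 100, yet the relative growth rate falls from 100% to 50% to 30%, which is the kind of pattern described above.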

ADD REPLYlink written 8.5 years ago by Andrewjgrimm440
10
8.6 years ago by
Mary11k
Boston MA area
Mary11k wrote:

There was another really good graphic that Lincoln Stein used in his talk at Beyond The Genome last week. It is available from this paper:

The case for cloud computing in genome informatics

It is Figure 2 in there. It shows the slope of sequence data growth before NGS and the recent change. It also shows where production crosses storage: we have now passed the point where we can afford to store it all:

"The cost of genome sequencing is now decreasing several times faster than the cost of storage, promising that at some time in the not too distant future it will cost less to sequence a base of DNA than to store it on a hard disk....The various members of the genome informatics ecosystem are now facing a potential tsunami of genome data that will swamp our storage systems and crush our compute clusters."

Also at this meeting people were trying to change the meme from big scary data (deluge, tsunami, etc) to "data bonanza". People were attempting to use that--but they still seemed scared :)

ADD COMMENTlink written 8.6 years ago by Mary11k

I like the information added within this response very much.

ADD REPLYlink written 8.6 years ago by Larry_Parnell16k

lol, I wasn't there, but you can count me with the scared ones. I do have ideas, and a plan, for dealing with a certain amount of data growth. But if this keeps going indefinitely, where will we end up? That's what I'm afraid of. Is Pac Bio going to save me from short reads? Or are they just going to multiply the data volume? Or both at the same time, plus a continuing flood of 2nd-gen data?

Or more generally - what is the new equilibrium going to look like, and when are we going to get there? The fact that I don't know is what makes me nervous.

ADD REPLYlink written 8.6 years ago by Mitch Skinner660

I'm denied access to the article from a University of California :(

ADD REPLYlink written 8.2 years ago by Aleksandr Levchuk3.1k
8
8.6 years ago by
Daniel Swan13k
Aberdeen, UK
Daniel Swan13k wrote:

You might want to have a look at the statistics from GOLD, the 'Genomes OnLine Database', here, as it has statistics at the genome level rather than the base-pair level.

ADD COMMENTlink written 8.6 years ago by Daniel Swan13k

I just realized that they actually have the data in an Excel spreadsheet at the top of the page which is what I wanted. http://genomesonline.org/Gold_Stats.xls
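Once the yearly totals are out of a spreadsheet like that, a rough way to check whether the growth is really exponential is to fit a straight line to the log of the counts and read off a doubling time. This is only a sketch: the function name `doubling_time` and the numbers are hypothetical, and reading the .xls file itself is left out because I am not assuming anything about its layout.

```python
import math


def doubling_time(totals):
    """Estimate doubling time (in years) from {year: cumulative count}.

    Fits log(count) = a + b * year by ordinary least squares and returns
    log(2) / b. Needs at least two distinct years and strictly positive
    counts.
    """
    years = sorted(totals)
    xs = [float(y) for y in years]
    ys = [math.log(totals[y]) for y in years]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # growth rate per year on the log scale
    return math.log(2) / slope


# Hypothetical counts that double every year, for illustration only.
example_counts = {2000: 10, 2001: 20, 2002: 40, 2003: 80}
example_doubling = doubling_time(example_counts)
print(f"estimated doubling time: {example_doubling:.2f} years")
```

If the points fall well off the fitted line, a single doubling time is not a good summary, and plotting the counts on a log axis would show that more honestly than one number.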

ADD REPLYlink written 8.6 years ago by Lee Katz2.9k
6
8.6 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum120k wrote:

See Genome Project Statistic: http://www.ncbi.nlm.nih.gov/genomes/static/gpstat.html

Update: ... and the (rather incomplete) Wikipedia category "Sequenced genomes": http://en.wikipedia.org/wiki/Category:Sequenced_genomes

ADD COMMENTlink written 8.6 years ago by Pierre Lindenbaum120k
4
8.2 years ago by
Yannick Wurm2.3k
Queen Mary University London
Yannick Wurm2.3k wrote:

This one is helpful too

http://www.genome.gov/sequencingcosts/


ADD COMMENTlink written 8.2 years ago by Yannick Wurm2.3k
1
7.2 years ago by
Bjoernsen40
Bjoernsen40 wrote:

I recommend using diArk for the latest genome files. The stats can be found at http://www.diark.org/diark/statistics

ADD COMMENTlink modified 7.1 years ago • written 7.2 years ago by Bjoernsen40

That's a bunch of neat plots, thanks for sharing this.

ADD REPLYlink written 7.2 years ago by Khader Shameer18k