I'm getting ready to release an R package that makes use of some moderately large genome annotation files (let's say 500MB-1GB total). I'd like to host them somewhere other than our lab's website, because I plan to continue maintaining this package after I move on from the lab.
Scripts are easy to throw up onto GitHub, Google Code, etc., but these larger files may be a problem. I know GitHub, for example, limits the storage space available to its non-premium users.
So what's the best place to host these files long-term, given that I'd like the hosting to be free, accessible for download without intermediate screens (so my script can pull them down), and reasonably tolerant of the storage space and bandwidth involved?
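For what it's worth, the "script can pull them down" requirement usually boils down to the host serving files at a stable, direct URL. A minimal sketch of the client side, with a placeholder URL standing in for whichever host is chosen (the function name and cache layout here are just illustrative):

```r
# Download an annotation file on first use and cache it locally.
# base_url is a placeholder -- substitute the real host's direct-download URL.
# mode = "wb" avoids corrupting binary files on Windows.
fetch_annotation <- function(filename,
                             base_url = "http://example.org/annotations",
                             cache_dir = file.path(tempdir(), "annot_cache")) {
  dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
  dest <- file.path(cache_dir, filename)
  if (!file.exists(dest)) {
    url <- paste(base_url, filename, sep = "/")
    download.file(url, dest, mode = "wb")
  }
  dest
}
```

Any host that breaks this pattern with an interstitial page (a click-through license screen, a captcha, a time-limited token) also breaks unattended use from R, which is why that requirement rules out several otherwise-free options.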
Agreed with everyone's concerns; it's the ol' chicken and egg problem. To have confidence in a site you want to see lots of use and great data, but to get people using it you need to develop confidence. The best thing we can do is to get the word out and use them when workable.
I agree. Data repositories like Dryad are the way to go, since you can wrap your files in searchable metadata and provide links to the published article.
BioTorrents is a great idea, but there's never been much good data on it.
I'm excited about up-and-coming projects like this, but I'm a little hesitant to go with a new site - who knows if it will be around in a few years?