Question: What is the difference between the GEO and SRA databases at NCBI?
4.9 years ago
Croatia/Zagreb/Faculty of Electrical Engineering and Computing
matija.sosic wrote:

I found that GEO holds "processed sequence data files", while SRA holds "raw sequence data files". Processed in what way?

I am interested in bacterial RNA-seq data; is GEO the right resource for me?

4.9 years ago
Chris Fields (2.0k)
University of Illinois Urbana-Champaign
Chris Fields wrote:

It depends on your needs. GEO contains data relevant to a particular experiment, and in most cases I believe it represents processed data (i.e. data that have been run through one or more analysis steps, such as trimming, alignment to a reference, R/Bioconductor, etc.). The example RNA-Seq from GEO is illustrative:

If you check under the BioSamples, you'll find a Cufflinks file, a SAM file, a bedGraph file, etc. These are all bound to a specific assembly version and whatever comes along with it (annotation, etc.).

SRA, however, contains the raw sequence data for an experiment, which is what you need if you want to download and re-analyze the data on your own. They are generally all tied together via a common BioProject ID.

EDIT: In that last sentence, by 'they' I mean any data relevant to the BioProject (BioSamples, SRA, assemblies, etc).
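
To make the relationship a bit more concrete, here is a minimal Python sketch (not an official NCBI tool) that guesses which resource an accession belongs to from its prefix and builds an E-utilities `esearch` URL for looking up SRA records by BioProject accession. The function names `classify_accession` and `sra_search_url` are my own, the prefix table only covers the common cases, and the accessions in the example are made-up placeholders. Passing the bare BioProject accession as the search term is an assumption about how you would want to query; check the E-utilities documentation before relying on it.

```python
import re
from urllib.parse import urlencode

# Common accession prefixes and the NCBI resource they usually belong to.
# This table is illustrative and deliberately incomplete.
_PATTERNS = [
    (re.compile(r"^(GSE|GSM|GPL|GDS)\d+$"), "GEO"),    # series, sample, platform, dataset
    (re.compile(r"^[SED]R[PXRS]\d+$"), "SRA"),         # study, experiment, run, sample
    (re.compile(r"^PRJ(NA|EB|DB)\d+$"), "BioProject"),
    (re.compile(r"^SAM(N|EA|D)\d+$"), "BioSample"),
]

# Base URL of the E-utilities esearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def classify_accession(acc: str) -> str:
    """Return the resource name guessed from the accession prefix, or 'unknown'."""
    acc = acc.strip().upper()
    for pattern, resource in _PATTERNS:
        if pattern.match(acc):
            return resource
    return "unknown"


def sra_search_url(bioproject: str) -> str:
    """Build (but do not send) an esearch URL querying SRA for a BioProject accession."""
    acc = bioproject.strip().upper()
    if classify_accession(acc) != "BioProject":
        raise ValueError(f"not a BioProject accession: {bioproject!r}")
    # Assumption: the bare accession is used as the search term.
    return ESEARCH + "?" + urlencode({"db": "sra", "term": acc, "retmode": "json"})


if __name__ == "__main__":
    # Placeholder accessions, not real records.
    for acc in ["GSE00000", "SRR0000000", "PRJNA000000", "SAMN00000000"]:
        print(acc, "->", classify_accession(acc))
    print(sra_search_url("PRJNA000000"))
```

The sketch only builds the URL, so nothing is downloaded. For the original question, the point is that a GSE accession leads to the processed files, while the raw reads are reached through the linked SRA records.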

