SRAdb / SRA difference in overall sample counts
5.3 years ago
mayars169 ▴ 20

I am interested in obtaining metadata for all of the samples in SRA. While using SRAdb, I ran into a couple of discrepancies that I was hoping to get clarification on.

I downloaded a local copy of the SQLite database (getSRAdbFile()) and ran:

rs <- dbGetQuery(sra_con, "select * from sample")

This returned a table with 4910256 samples/rows.
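One quick sanity check on that row count is whether every row is a distinct accession. The sketch below uses a small made-up data frame standing in for `rs`. It assumes the sample table exposes a `sample_accession` column, and every accession other than DRS000178 is a hypothetical placeholder.

```r
# Toy stand-in for the `rs` data frame returned by dbGetQuery().
# Assumption: the sample table has a `sample_accession` column.
# All accessions except DRS000178 are hypothetical placeholders.
rs_toy <- data.frame(
  sample_accession = c("DRS000178", "SRS000001", "SRS000001", "ERS000002", NA),
  stringsAsFactors = FALSE
)

acc <- rs_toy$sample_accession

n_rows     <- nrow(rs_toy)                 # total rows, as in the post
n_missing  <- sum(is.na(acc))              # rows with no accession at all
n_distinct <- length(unique(na.omit(acc))) # distinct non-missing accessions

# Accessions that appear on more than one row.
dup_acc <- unique(acc[duplicated(acc) & !is.na(acc)])

cat("rows:", n_rows, "distinct:", n_distinct, "missing:", n_missing, "\n")
```

If the distinct count on the real table matches the row count, duplicates can be ruled out as a source of the gap.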

If I go to the NCBI trace directory for SRA, though (https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=samples&term=&page=1), it says "4527183 samples found", a difference of 383,073.

I thought the reason for the difference might be that the trace directory isn't synced to SRA, but the graph on the statistics page shows it updated as recently as today.

Additionally, I used the listSRAfile function to obtain the ftp address for all sample accessions in the SRAdb sample table.

469,150 samples did not have an ftp address. I searched for several of them in SRA (by multiple identifiers) and couldn't find any hits. Ex:

165 "SAMD00013638" "DRS000178" NA "BioSample" 10116 "Rattus norvegicus" NA NA NA NA NA NA NA NA NA NA "sample_name: DRS000178 || strain: Wistar || dev stage: 3 days old || tissue appendix: Barrel cortex, all layers || tissue type: cerebral cortex" "DRA000155" "2016-11-20 05:18:13"

So I'm guessing that means they've been removed from the database, but that number seems very high. They have a range of prefixes and dates. I've looked through the manuals and all of the posts I could find here and haven't found anything directly relevant.
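The accessions without an FTP address can be pulled out and tallied by prefix with a few lines of base R. This is only a sketch: `files_toy` is a made-up data frame, and the column names `sample` and `ftp` are assumptions about the shape of the listSRAfile result, not confirmed output.

```r
# Hypothetical stand-in for the listSRAfile() result.
# Assumption: one row per accession, with `sample` and `ftp` columns.
files_toy <- data.frame(
  sample = c("DRS000178", "SRS000001", "ERS000002", "DRS000003"),
  ftp    = c(NA, "ftp://example.org/SRS000001.sra", NA, NA),
  stringsAsFactors = FALSE
)

# Accessions for which no FTP address came back.
no_ftp <- files_toy$sample[is.na(files_toy$ftp)]

# Tally by three-letter prefix (DRS = DDBJ, ERS = ENA, SRS = NCBI).
prefix_counts <- table(substr(no_ftp, 1, 3))
print(prefix_counts)
```

A tally that is heavily skewed toward one prefix would point at a single archive rather than at removals spread across all of SRA.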

I think this is still going to be very useful, but it would be nice to be able to explain these differences. Does anyone with more experience with SRA / SRAdb know what's going on?

SRA SRAdb • 1.0k views

You may want to send the question in to SRA-help just to be sure.

The metadata you are looking for may be available in this FTP directory as report files.

