I'm trying to determine whether a replicate is technical or biological. I have ~3000 SRR (run) files from the Roadmap project. The directories I downloaded are laid out in the following format:
SRX006235 is an experiment's accession ID and SRR018454 is a run's accession ID. An experiment can have more than one run. My assumption is that these runs could be either technical or biological replicates. If I want to consolidate multiple replicates (for a single mark, H3K27me3 in one cell line, for instance) and analyze the reads later on, I need to categorize each run as a technical or a biological replicate. Since I have about 3000 of these, I would like to automate the process.
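To make the bookkeeping concrete, here is a minimal sketch of the grouping step only, in Python. It assumes each run sits under its experiment in the path, as in `SRX006235/SRR018454/...`; that layout, the function names, and the second run accession in the usage note are assumptions for illustration. It does not decide whether a replicate is technical or biological; it only lists which experiments have several runs to classify.

```python
import re
from collections import defaultdict

# Assumed layout: an SRX directory directly containing an SRR directory or file.
ACCESSION_PAIR = re.compile(r"(SRX\d+)/(SRR\d+)")


def group_runs_by_experiment(paths):
    """Map each experiment accession (SRX) to its sorted run accessions (SRR)."""
    runs = defaultdict(set)
    for path in paths:
        # Normalize Windows separators so the pattern sees forward slashes.
        match = ACCESSION_PAIR.search(path.replace("\\", "/"))
        if match:
            experiment, run = match.groups()
            runs[experiment].add(run)
    return {experiment: sorted(found) for experiment, found in runs.items()}


def experiments_with_multiple_runs(grouped):
    """Keep only experiments that have more than one run."""
    return {exp: found for exp, found in grouped.items() if len(found) > 1}
```

For example, `group_runs_by_experiment(["SRX006235/SRR018454/SRR018454.sra", "SRX006235/SRR018455/SRR018455.sra"])` would collect both runs under `SRX006235` (the second run accession is made up for the example). The paths could come from `os.walk` over the download directory.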
I tried the R packages GEOquery, GEOmetadb and SRAmetadb to look up replicate information for a run, but haven't been able to find it. SRAmetadb (http://gbnci.abcc.ncifcrf.gov/sra/) and GEOmetadb (http://gbnci.abcc.ncifcrf.gov/geo/) rely on supporting SQLite3 databases from the Meltzer lab.
Does anyone know of a way to achieve this?
Thanks for reading!