I'm writing to ask if anyone is aware of database-backed programs for management of SNP data, besides the ones that I list below. To a first approximation, the purpose of such a tool is to import data (which often requires validation) from upstream source files into a database, and then export it into a form usable by analysis tools.
As discussed in the paper, tools like PLINK expect the data to already be in a format they can use. However, getting data into that format can be a nightmare, especially if the data is dirty. So, such a system would be a supplement to existing tools. Of course, once the data is in the database, it can be used for other things.
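To make the import/validate/export idea concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module. Everything in it is hypothetical: the table layout, the function names, the `0` missing-allele code and the sample records are assumptions made for illustration, and none of it is taken from SNPLims, GWAS Analyzer or any other tool mentioned here. The export is a simplified one-line-per-sample layout standing in for whatever format the analysis tool actually expects.

```python
import sqlite3

# Assumption: alleles are single letters, with "0" standing for a missing allele.
VALID_ALLELES = set("ACGT0")


def validate(sample_id, snp_id, genotype):
    """Return True if a raw record looks like a two-allele genotype call."""
    return (
        bool(sample_id)
        and bool(snp_id)
        and len(genotype) == 2
        and set(genotype) <= VALID_ALLELES
    )


def import_genotypes(conn, records):
    """Insert valid records into the database; return the rejected ones."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS genotype ("
        "sample_id TEXT, snp_id TEXT, alleles TEXT, "
        "PRIMARY KEY (sample_id, snp_id))"
    )
    rejected = []
    for rec in records:
        if validate(*rec):
            conn.execute("INSERT OR REPLACE INTO genotype VALUES (?, ?, ?)", rec)
        else:
            rejected.append(rec)
    conn.commit()
    return rejected


def export_lines(conn, snp_order):
    """Return one line per sample: sample id, then allele pairs in snp_order."""
    samples = [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT sample_id FROM genotype ORDER BY sample_id"
        )
    ]
    lines = []
    for sample in samples:
        calls = dict(
            conn.execute(
                "SELECT snp_id, alleles FROM genotype WHERE sample_id = ?",
                (sample,),
            )
        )
        fields = [sample]
        for snp in snp_order:
            a1, a2 = calls.get(snp, "00")  # no stored call is exported as "0 0"
            fields.append(f"{a1} {a2}")
        lines.append(" ".join(fields))
    return lines


# Made-up upstream records: (sample id, SNP id, genotype call).
raw = [
    ("S1", "rs1", "AG"),
    ("S1", "rs2", "CC"),
    ("S2", "rs1", "AX"),  # dirty record: "X" is not a valid allele
    ("S2", "rs2", "CT"),
]
conn = sqlite3.connect(":memory:")
rejected = import_genotypes(conn, raw)
lines = export_lines(conn, ["rs1", "rs2"])
```

The point of the sketch is only that dirty records are caught at import time and kept aside for inspection, while the export step can be rerun in any SNP order once the clean data is in the database.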
I have looked at other software that does this, but only found two, namely "SNPLims: a data management system for genome wide association studies", and "GWAS Analyzer: integrating genotype, phenotype and public annotation data for genome-wide association study".
However, the lead author of SNPLims told me the source code is
GWAS Analyzer has (in my opinion) major usability issues. I'm using the source code available
I am not aware of any other systems. I find it hard to believe a system like this is not in standard use - perhaps I am missing something. It seems entirely possible that other systems have been created but are proprietary or have simply not been written about.
So, I'm writing to ask if anyone has written or is otherwise using a system like this, aside from those listed here, or if not, is aware of one. Thanks.
EDIT: Updated with the recently published PLoS ONE paper. Note: I'm also trying to upload my SNPpy paper to arXiv, but they have an annoying endorsement procedure: I need to be endorsed by someone who has recently uploaded papers (at least 2 in the last 5 years) to the Quantitative Biology section of arXiv. If you can help, please add a comment. Thanks.