Hi, I have identified a SNP in a certain protein, and the variant has an rs number. Using data in dbGaP or other databases, I want to determine whether this variant is associated with poorer survival or disease outcome.
For example, I have the variant rs12345, and I would like to see whether it is associated with more severe inflammatory bowel disease (IBD) or whether it is protective against IBD. How do I approach this problem?
Please assume that I have access to sequencing-level data from dbGaP and to a high-performance computing cluster.