I am planning to analyse exome sequencing data. I am expecting data from the Illumina Infinium Human Exome-12 BeadChip platform.
I am a statistician by training with an interest in bioinformatics. I am designing a case-control study with 100 diabetes cases and 100 controls (no diabetes).
I have the following questions:
- Would 100 cases and 100 controls be sufficient to identify variants associated with diabetes?
- Which software or pipeline (for Linux) would be useful for analysing this type of data?
I would really appreciate any help in this regard. Does this study design make sense for the given data?
Is there any good reference for this type of analysis?
Sorry for the many questions.
UPDATE: I recently found out that the data are from exome sequencing (mean 40x coverage, Agilent v4, HiSeq). What would you suggest regarding my questions above about sample size and software?
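
To make the sample-size question concrete, one rough way to frame it is a power calculation for a two-proportion z-test on carrier frequency in cases versus controls. The sketch below is only an illustration under assumptions that are not from the study: the carrier frequencies (5% vs 15%), the two significance thresholds, and the function name `two_proportion_power` are all hypothetical, and the normal approximation is crude for rare variants.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_power(p_controls, p_cases, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test.

    Uses the normal approximation with equal group sizes; the tiny
    contribution from the opposite rejection tail is ignored.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)

    # Standard error under the null (pooled frequency) and under the alternative.
    p_bar = (p_controls + p_cases) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    se_alt = sqrt(
        p_controls * (1 - p_controls) / n_per_group
        + p_cases * (1 - p_cases) / n_per_group
    )

    diff = abs(p_cases - p_controls)
    return norm.cdf((diff - z_crit * se_null) / se_alt)


if __name__ == "__main__":
    # Hypothetical inputs: 5% carriers in controls, 15% in cases, 100 per group.
    # 0.05 is a single-test threshold; 2.5e-6 is an assumed multiple-testing
    # threshold (roughly 0.05 divided by 20,000 genes).
    for alpha in (0.05, 2.5e-6):
        power = two_proportion_power(0.05, 0.15, 100, alpha=alpha)
        print(f"alpha={alpha:g}: approximate power={power:.3f}")
```

Running it with a single-test threshold and then with a much stricter multiple-testing threshold gives a feel for how much power is lost at a fixed 100 vs 100 design; the inputs should be replaced with effect sizes that are plausible for the study.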