I am an applied statistician by profession, with a strong interest in analyzing next-generation sequencing (NGS) data. I would like to get a big-picture view of the challenges this field poses for statisticians. So far I have gone through recent issues of bioinformatics journals, but I am still confused. Could you recommend some review articles or other material?
I think the commenters on your post are on the right track. You really need to specify the domain of application to even define the right challenges.
But just for the sake of it, I will attempt to zoom all the way out and formulate what I think the main challenges are:
- Systematic errors - every single step of sample preparation and sequencing carries a bias, and since we make hundreds of millions of measurements, the effects of these biases are always visible and often stronger than the effects we are trying to measure. For example: data collected on Mondays may be more consistent with each other than data from healthy patients are.
- Multiple comparisons - we usually measure all the components of the biological system simultaneously, and many (perhaps most) of them may still be unknown.
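
To make the multiple comparisons point concrete, here is a minimal sketch in Python. The answer above does not name a specific correction; the Benjamini-Hochberg procedure is swapped in here as one common choice, and the figure of 20,000 tests, the function name `benjamini_hochberg`, and the simulated p-values are all assumptions made purely for illustration.

```python
import random


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans, True where a test is rejected
    by the Benjamini-Hochberg step-up procedure at FDR level alpha."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical experiment: 20,000 features, none truly different,
# so every p-value is uniform on [0, 1].
random.seed(0)
n_tests = 20000
null_pvalues = [random.random() for _ in range(n_tests)]

naive_hits = sum(p < 0.05 for p in null_pvalues)
bh_hits = sum(benjamini_hochberg(null_pvalues))

print("uncorrected hits at p < 0.05:", naive_hits)
print("hits after Benjamini-Hochberg:", bh_hits)
```

With no real signal at all, the uncorrected threshold still flags roughly 5% of the tests (on the order of a thousand features here), while the corrected procedure flags at most as many and typically far fewer. Note that this only addresses the multiplicity problem; it does nothing about the systematic errors in the first bullet, which can produce small p-values that are perfectly genuine yet biologically meaningless.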