We have just finished a blood-based aptamer proteomics screen in which we targeted ~7000 proteins. In the initial analysis, statistical significance between control and disease conditions was determined with an FDR correction across all proteins. We found some interesting hits, but some of the proteins previously shown to be involved in the disease were absent. We then extracted the columns containing the proteins we most suspected and performed t-tests on those individual columns. Some of these targets were found to be significantly different. My question is: how valid is this approach? In other words, is it fair to subset a proteomics data set and perform t-tests when there are far fewer data points, and therefore far fewer comparisons to correct for? I don't want to cherry-pick data through my post hoc analysis, but some of these observations recapitulate what has been previously reported in the field. I also suspect I have an outlier problem that may explain some of these differences, but that is a different issue.
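To make the question concrete, here is a minimal sketch of what I think is happening. It assumes the FDR step was a Benjamini-Hochberg adjustment (I have not confirmed that this is what our pipeline uses), and every p-value in it is made up for illustration, not taken from our data. The function name `bh_adjust` and the evenly spaced "null" p-values are likewise just placeholders. The same five candidate p-values are adjusted once as part of a 7000-protein screen and once as a 5-protein subset.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# HYPOTHETICAL raw t-test p-values for five literature-based candidates.
candidate_p = [0.004, 0.01, 0.03, 0.2, 0.6]

# HYPOTHETICAL stand-in for the other 6995 proteins: evenly spaced
# p-values on (0, 1), roughly what pure noise would produce.
n_other = 6995
other_p = [(k + 0.5) / n_other for k in range(n_other)]

# Adjust the candidates within the full screen, then keep their entries.
full_q = bh_adjust(candidate_p + other_p)[: len(candidate_p)]

# Adjust the candidates as if only these five tests had been run.
subset_q = bh_adjust(candidate_p)

for p, q_full, q_sub in zip(candidate_p, full_q, subset_q):
    print(f"p={p:<6} q(full screen)={q_full:.3f}  q(subset)={q_sub:.3f}")
```

In this toy setup the smallest candidate p-values survive the correction only in the subset, because the penalty scales with the number of tests. As far as I understand, that is only defensible if the subset was fixed from the literature before looking at the data, which is exactly the part I am unsure about in our case.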
Thanks in advance!