Good morning, afternoon, or evening, wherever you are in the world!
Small background: My name is Jason Mora and I stumbled here from /r/Bioinformatics. I am a recent graduate of California State University, East Bay, where I did my undergraduate degree in mathematics with a minor in computer science.
Currently I am working on a bioinformatics project and would like some clarification. The data set I am working with has 16 columns. Eight hold gene expression values (averaged per gene) for the experimental conditions, meaning the control and the various stress conditions, and the other eight hold the respective p-values for those conditions, e.g. Control and Control.Pvalue, or Stress1 and Stress1.PValue. Each row is a gene, and the rows cover all of the genes for the organism. I was not involved in the calculation of the p-values, so I am assuming the data is ready for data mining and interpretation. I have read various papers in the research area we are studying, as well as papers on approaches to analyzing biological data, and I keep coming across statements along the lines of "The log2FC is 3.45 (p < 0.01) ....". While exploring the data, I see that some genes have significant p-values (I am using p < 0.05 for now) across all 8 conditions, while other genes have significant p-values for the stress conditions but not for the control condition.
Let's say Gene A:
Gene A Control Gene Exp: ###
Gene A Control P-Value: 0.31
Gene A Stress 1 Gene Exp: ######
Gene A Stress 1 P-Value: 0.001253
Gene A Stress 2 Gene Exp: ####
Gene A Stress 2 P-Value: 0.002512
My question is the following: when I want to calculate the fold change, must both the control's p-value and the stress condition's p-value satisfy the chosen significance level? Or does the threshold apply to the stress condition only?
Again, I am not working on the assays themselves; the data was given to me in an Excel file (as described above). I simply want to know whether the fold change between the control and a stress condition should only be calculated when both p-values are below the significance level, or whether it's safe to 'ignore' the control's p-value. I am using Python and a number of Python libraries to help me visualize and analyze this data set.
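To make the question concrete, here is a minimal sketch of the two filtering options I am asking about. It is only an illustration under assumptions: the expression values are made up (my real values are not shown here), I assume they are positive and on a linear scale so that log2(stress / control) is meaningful, and the column names and the helper functions `log2_fold_change` and `summarize` are hypothetical. It does not claim that either option is the correct one.

```python
import math

ALPHA = 0.05  # significance threshold I am using for now

# Hypothetical row mimicking the spreadsheet layout.
# The p-values are the ones from the Gene A example; the expression values are made up.
genes = {
    "GeneA": {
        "Control": 120.0, "Control.Pvalue": 0.31,
        "Stress1": 960.0, "Stress1.Pvalue": 0.001253,
        "Stress2": 480.0, "Stress2.Pvalue": 0.002512,
    },
}

def log2_fold_change(stress_value, control_value):
    """log2(stress / control); assumes positive, linear-scale values."""
    return math.log2(stress_value / control_value)

def summarize(row, conditions, require_control=False):
    """Return {condition: (log2FC, passes_filter)} for one gene.

    require_control=False: only the stress p-value must be below ALPHA.
    require_control=True:  the control p-value must be below ALPHA as well.
    """
    control_ok = row["Control.Pvalue"] < ALPHA
    result = {}
    for cond in conditions:
        lfc = log2_fold_change(row[cond], row["Control"])
        passes = row[cond + ".Pvalue"] < ALPHA
        if require_control:
            passes = passes and control_ok
        result[cond] = (lfc, passes)
    return result

# Option 1: filter on the stress p-value only.
stress_only = summarize(genes["GeneA"], ["Stress1", "Stress2"])
# Option 2: require the control p-value to be significant too.
both = summarize(genes["GeneA"], ["Stress1", "Stress2"], require_control=True)

print(stress_only)
print(both)
```

With these made-up numbers, Gene A passes the filter under option 1 but is dropped under option 2, because its control p-value is 0.31. That difference is exactly what I am unsure about.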
Any clarification would be greatly appreciated! If you have any questions, I will do my best to answer them and clear up any concerns. Thank you for taking the time to read this!