Hi!
I am working with an RNA-seq dataset that contains FPKM values for many genes. I would like to dichotomize the values based on thresholds. For example, every value greater than 0.5 becomes 1 (expressed) and every value less than 0.5 becomes 0 (not expressed). I determined a threshold for each gene and stored the thresholds in a Python list. My dataset looks like this:
GeneA GeneB GeneC GeneN
x1A x1B x1C x1N
x2A x2B x2C x2N
xnA xnB xnC xnN
I would like to perform this operation:
df.loc[df["GeneA"] < threshold, "GeneA"] = 0
df.loc[df["GeneA"] > 0, "GeneA"] = 1
Each gene has its own threshold, so I need to apply the operation to every column of the dataset, using a threshold that differs from column to column. Let's assume I have a list named "threshold" that holds all the values.
Could you suggest an effective way to do this? Thanks!
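For illustration, here is a minimal sketch of one way this could be done with pandas. It assumes that the "threshold" list has one entry per gene, in the same order as the columns of df. The toy values and the names "thresholds" and "binary" are made up for the example and are not from my real data.

```python
import pandas as pd

# Hypothetical toy data: rows are samples, columns are genes (FPKM values).
df = pd.DataFrame(
    {
        "GeneA": [0.2, 0.7, 1.5],
        "GeneB": [3.0, 0.1, 2.0],
        "GeneC": [0.0, 0.4, 0.9],
    }
)

# Assumption: one threshold per gene, in the same order as df.columns.
threshold = [0.5, 1.0, 0.3]

# Label each threshold with its gene name so pandas aligns by column label.
thresholds = pd.Series(threshold, index=df.columns)

# Compare every column against its own threshold, then convert
# True/False to 1/0 (expressed / not expressed).
binary = df.gt(thresholds, axis="columns").astype(int)

print(binary)
```

With a strict "greater than" comparison, a value exactly equal to its threshold becomes 0. Swapping df.gt for df.ge would turn it into 1 instead.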