Hello all,
Please forgive my ignorance. I am a novice at RNA-seq and bioinformatics.
I have read a few posts suggesting that I follow standard pipelines for analyzing DEGs in a given experiment. However, I do not have the raw data and have been given only a file containing TPM values. I have been told to do a t-test, adjust for multiple testing, and filter for DEGs from this list.
My questions:
a) Is this a valid approach?
b) I have read that microarray data are first log2-transformed and then analyzed for DEGs. Can a similar approach be taken in my situation? My concern is that there are many rows where 50% of the TPM values are 0. Would it be wise to remove these rows?
c) If a t-test suffices and log2 transformation is not required, do I calculate logFC values as mean(log2(test)) - mean(log2(control))?
If a similar question has already been asked, please feel free to disregard. I'm still going through the forums to try and figure out what is the best approach (in my case).
Thank you
If you have normalised expression values and still want to identify DEGs, you can try NOISeq, which is a non-parametric method. You can use the TPM matrix directly as input. The manual has a nice step-by-step workflow.
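If you want to see what the t-test route from the question would look like, here is a rough sketch in Python (NumPy and SciPy). It is only an illustration under my own assumptions, not a recommended pipeline: the function names `bh_adjust` and `ttest_degs`, the pseudocount of 1 added before log2 (needed because log2(0) is undefined), and the rule of dropping rows where half or more of the values are 0 are all choices I made for the example. It uses Welch's t-test per gene and Benjamini-Hochberg adjustment, and computes logFC as the difference of mean log2 values, as in question (c).

```python
import numpy as np
from scipy import stats


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values; NaN inputs stay NaN."""
    p = np.asarray(pvals, dtype=float)
    adjusted = np.full(p.shape, np.nan)
    ok = np.isfinite(p)
    m = int(ok.sum())
    if m == 0:
        return adjusted
    pv = p[ok]
    order = np.argsort(pv)
    # p * m / rank, for ranks 1..m in ascending order of p
    scaled = pv[order] * m / np.arange(1, m + 1)
    # enforce monotonicity, working down from the largest p-value
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.minimum(scaled, 1.0)
    adjusted[ok] = out
    return adjusted


def ttest_degs(control, test, pseudocount=1.0, max_zero_frac=0.5):
    """control, test: genes x samples arrays of TPM values.

    pseudocount and max_zero_frac are illustrative assumptions.
    Returns (keep mask, log2 fold change, p-values, adjusted p-values);
    the last three cover only the genes where keep is True.
    """
    control = np.asarray(control, dtype=float)
    test = np.asarray(test, dtype=float)
    all_tpm = np.hstack([control, test])
    # keep genes where fewer than max_zero_frac of the samples are zero
    keep = (all_tpm == 0).mean(axis=1) < max_zero_frac
    log_c = np.log2(control[keep] + pseudocount)
    log_t = np.log2(test[keep] + pseudocount)
    # logFC = mean(log2(test)) - mean(log2(control))
    log_fc = log_t.mean(axis=1) - log_c.mean(axis=1)
    # Welch's t-test (unequal variances), one test per gene
    _, pvals = stats.ttest_ind(log_t, log_c, axis=1, equal_var=False)
    return keep, log_fc, pvals, bh_adjust(pvals)
```

You would call it as `keep, log_fc, pvals, padj = ttest_degs(control_tpm, test_tpm)` and then filter on `padj` and `log_fc` with whatever cutoffs you decide on. Keep in mind that with only a few replicates per group a per-gene t-test has little power.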
Thank you for your response. I shall look into this.