My lab collected blood plasma cfRNA samples from breast cancer patients, with non-cancer patients as controls. The PI designed a custom gene chip to sequence 65 genes that he predicts will be upregulated in cancer patients. He did this to save cost (it is cheaper than sequencing the entire transcriptome) and to reduce noise (his previous study showed that cfRNA data can be very noisy).
We now have the data and I'm meant to start analysing it, but I have no idea how to normalise it...
No housekeeping genes were sequenced, and many of the genes are expected to be differentially expressed. This makes TMM, RPKM, and other commonly used methods like DESeq2 inappropriate.
Any idea what I could do?
I thought perhaps I could apply a CLR or log transform to the data and then run a Welch t-test between the two groups.
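To make the idea concrete, here is a minimal sketch of what I have in mind. It assumes a samples-by-genes matrix of raw counts and uses NumPy and SciPy. The function names, the pseudocount of 0.5, and the simulated counts are placeholders I made up for illustration, not our real data or an established pipeline. One thing I am unsure about: CLR centres each sample on the geometric mean of its own 65 genes, so if most of the panel shifts in the same direction, that shift would be partly absorbed by the centring.

```python
import numpy as np
from scipy import stats


def clr_transform(counts, pseudocount=0.5):
    """Centred log-ratio transform, applied per sample (row).

    The pseudocount is an arbitrary choice to avoid log(0).
    """
    shifted = np.asarray(counts, dtype=float) + pseudocount
    log_counts = np.log(shifted)
    # Subtract each sample's mean log count (log of its geometric mean).
    return log_counts - log_counts.mean(axis=1, keepdims=True)


def welch_per_gene(clr_cases, clr_controls):
    """Welch t-test (unequal variances) for every gene (column)."""
    t_stat, p_val = stats.ttest_ind(
        clr_cases, clr_controls, axis=0, equal_var=False
    )
    return t_stat, p_val


# Simulated stand-in data: 12 samples x 65 genes (NOT real measurements).
rng = np.random.default_rng(0)
counts = rng.poisson(lam=50, size=(12, 65))
is_cancer = np.array([True] * 6 + [False] * 6)

clr_values = clr_transform(counts)
t_stat, p_val = welch_per_gene(clr_values[is_cancer], clr_values[~is_cancer])
```

This gives one t-statistic and one p-value per gene, so with 65 genes the p-values would presumably still need a multiple-testing correction.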
Thank you in advance for your feedback.