I am currently working with Fluidigm qRT-PCR data. There are 3 plates, each with 96 genes (88 target genes + 8 reference genes), combined into one file of 288 genes in total (264 target genes + 8 * 3 = 24 reference-gene measurements, because the 8 reference genes are run in triplicate). In summary, I have 264 target genes and 8 * 3 = 24 reference-gene measurements in around 45 samples. Each sample was run with technical replicates.
I would like to know whether the analysis methodology described below is correct.
A. Handling multiple reference/housekeeping genes
Since I have 8 reference genes in triplicate for each sample in the combined file, I created a data file containing these reference genes across all samples, and then:
- Average the on-chip reference genes within each sample (arithmetic mean of the 8 * 3 reference-gene values, giving 8 * 1 reference-gene values per sample)
- Identify the most stable reference genes (for instance, the top 4) using appropriate in silico approaches (geNorm, NormFinder, BestKeeper, etc.) based on their stability measure (e.g. the geNorm M-value)
- Create a pseudogene by calculating the geometric mean of the top 4 stable reference genes for each sample
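The three steps above can be sketched in plain Python. This is only a rough illustration under stated assumptions: the nested `ct` dictionary, the gene and sample names, and the helper functions are all hypothetical, and the M-value here is a single-pass approximation of the geNorm idea (mean standard deviation of pairwise Ct differences, assuming roughly 100% amplification efficiency) without geNorm's stepwise exclusion of the least stable gene. The geometric mean is taken on Ct values as described in the post; since the geometric mean of linear-scale quantities corresponds to the arithmetic mean of Ct values, it is worth confirming which scale is intended.

```python
from itertools import combinations
from statistics import geometric_mean, mean, stdev

# Hypothetical toy data: ct[sample][gene] holds the triplicate Ct values.
ct = {
    "S1": {"REF1": [17.9, 18.0, 18.1], "REF2": [19.9, 20.0, 20.1], "REF3": [21.9, 22.0, 22.1]},
    "S2": {"REF1": [18.9, 19.0, 19.1], "REF2": [20.9, 21.0, 21.1], "REF3": [24.9, 25.0, 25.1]},
    "S3": {"REF1": [16.9, 17.0, 17.1], "REF2": [18.9, 19.0, 19.1], "REF3": [21.9, 22.0, 22.1]},
}

def average_replicates(ct):
    """Step 1: arithmetic mean of the replicate Ct values per sample and gene."""
    return {s: {g: mean(vals) for g, vals in genes.items()} for s, genes in ct.items()}

def m_values(avg):
    """Step 2 (approximation): for each gene, the mean over all other genes of
    the standard deviation, across samples, of the pairwise Ct difference."""
    samples = list(avg)
    genes = list(avg[samples[0]])
    pairwise = {g: [] for g in genes}
    for a, b in combinations(genes, 2):
        v = stdev([avg[s][a] - avg[s][b] for s in samples])
        pairwise[a].append(v)
        pairwise[b].append(v)
    return {g: mean(vs) for g, vs in pairwise.items()}

def pseudo_reference(avg, chosen):
    """Step 3: geometric mean of the chosen reference genes' Ct values,
    computed separately for each sample (as described in the post)."""
    return {s: geometric_mean([avg[s][g] for g in chosen]) for s in avg}

avg = average_replicates(ct)
m = m_values(avg)
top = sorted(m, key=m.get)[:2]   # lower M = more stable; top 2 here, top 4 in the post
pseudo = pseudo_reference(avg, top)
print(m, top, pseudo)
```

In this toy data REF3 drifts relative to the other two genes, so it gets the highest M-value and is left out of the pseudo-reference.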
B. After this:
- Create a new data file by averaging (arithmetic mean) the technical replicates in every sample for the remaining genes, i.e. all 264 target genes + the pseudogene (geometric mean of the top 4 reference genes), with the following columns:
  Detector | Target Gene 1 | Target Gene 2 | ... | Pseudogene
- Calculate ΔCt (difference between the Ct of the target gene and the Ct of the reference gene, i.e. the pseudogene)
- Calculate ΔΔCt (difference between each sample's ΔCt and the average ΔCt of the control samples)
- Calculate 2^(-ΔΔCt) to evaluate fold changes in gene expression
Please let me know whether the above analysis methodology looks fine.
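For one target gene, the ΔCt, ΔΔCt and 2^(-ΔΔCt) steps can be sketched as below. The sample names, Ct values and the `fold_changes` helper are hypothetical, the inputs are assumed to be replicate-averaged already, and the 2^(-ΔΔCt) formula itself assumes close to 100% amplification efficiency for both target and reference.

```python
from statistics import mean

# Hypothetical replicate-averaged Ct values for one target gene and the
# pseudo-reference, keyed by sample (C = control, T = treated).
target_ct = {"C1": 25.0, "C2": 26.0, "T1": 23.5, "T2": 24.5}
pseudo_ct = {"C1": 20.0, "C2": 20.0, "T1": 20.5, "T2": 20.5}
controls = ["C1", "C2"]

def fold_changes(target_ct, reference_ct, control_samples):
    """Return 2^(-ddCt) per sample for a single target gene."""
    # dCt: target Ct minus reference (pseudogene) Ct, within each sample
    delta_ct = {s: target_ct[s] - reference_ct[s] for s in target_ct}
    # calibrator: average dCt of the control samples
    calibrator = mean(delta_ct[s] for s in control_samples)
    # ddCt: each sample's dCt minus the calibrator, then fold change
    return {s: 2 ** -(delta_ct[s] - calibrator) for s in delta_ct}

fc = fold_changes(target_ct, pseudo_ct, controls)
for sample, value in fc.items():
    print(sample, round(value, 3))
```

Because the calibrator is the mean ΔCt of the controls, the control fold changes scatter around 1 and the treated samples are expressed relative to that baseline.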