Hi guys, I'm fairly new to RNA-seq. I used 3 custom-made RNA oligos as spike-ins in my tRNA-seq experiment. I want to normalize my dataset using these spike-ins and compare the result to a more established normalization method (such as the size factors used in DESeq2). So far, I've generated a count matrix for my spike-ins (using featureCounts), but I'm unsure how to do the normalization. 1) How can I normalize my data using my spike-ins? 2) I already have my dataset normalized using size factors in DESeq2. How do I compare the two normalizations (spike-in vs. size factors), and is there anything in particular I should look out for? Thanks in advance!
Append the spike-in count matrix to your main count matrix as extra rows (same samples, same column order), build the DESeq2 object from the combined matrix, and then use:
dds <- estimateSizeFactors(dds, controlGenes=spikeins)
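where spikeins is an index vector (logical or numeric) marking the spike-in rows in your matrix. The size factors are then estimated from those rows only. You can start comparing the two normalizations by looking at the distribution of the normalized values to see if they make sense. Be aware that spike-ins only account for the technical steps that happen after they were added to the sample, so you might still see differences caused by earlier steps. Also, with only 3 spike-ins each size factor rests on just three ratios, so a single noisy oligo can shift it noticeably. You can use code of this sort to plot the range of normalized values: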
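library(ggplot2)
library(reshape)
# counts() returns a matrix, so convert it to a data frame before adding a column
nrcounts <- as.data.frame(counts(dds, normalized=TRUE))
# flag the spike-in rows (works for a logical or a numeric index vector)
nrcounts$S <- FALSE
nrcounts$S[spikeins] <- TRUE
nrmlt <- melt(nrcounts, id.vars = c("S"))
colnames(nrmlt) <- c("Spikeins", "Sample", "counts")
ggplot(nrmlt, aes(Sample, counts, color=Spikeins)) + geom_boxplot() + scale_y_log10()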
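For the second question (what to look out for), it may help to see what the two kinds of size factors do on a tiny example. The following is only a base-R sketch of the median-of-ratios idea, not DESeq2's own code: the helper median_ratio_sf and all the numbers are made up for illustration. In this toy data every tRNA doubles in sample S2 while the spike-ins stay constant.

```r
# Sketch of median-of-ratios size factors, optionally restricted to a subset
# of rows (e.g. spike-ins). Hypothetical helper, not part of DESeq2.
median_ratio_sf <- function(counts, rows = rep(TRUE, nrow(counts))) {
  sub <- counts[rows, , drop = FALSE]
  log_geomeans <- rowMeans(log(sub))  # -Inf for any row containing a zero
  apply(sub, 2, function(cnts) {
    keep <- is.finite(log_geomeans) & cnts > 0
    exp(median((log(cnts) - log_geomeans)[keep]))
  })
}

# Toy counts (invented): 4 tRNAs that all double in S2, 3 constant spike-ins.
counts <- rbind(
  tRNA_a  = c(100, 200),
  tRNA_b  = c(50, 100),
  tRNA_c  = c(400, 800),
  tRNA_d  = c(30, 60),
  spike_1 = c(1000, 1000),
  spike_2 = c(500, 500),
  spike_3 = c(250, 250)
)
colnames(counts) <- c("S1", "S2")
spikeins <- grepl("^spike_", rownames(counts))  # logical index of spike-in rows

sf_all   <- median_ratio_sf(counts)            # about 0.707 and 1.414
sf_spike <- median_ratio_sf(counts, spikeins)  # 1 and 1

# Divide each column by its size factor
norm_all   <- sweep(counts, 2, sf_all, "/")
norm_spike <- sweep(counts, 2, sf_spike, "/")

# log2 fold change S2 vs S1 for the endogenous tRNAs under each normalization
lfc_all   <- log2(norm_all[!spikeins, "S2"] / norm_all[!spikeins, "S1"])
lfc_spike <- log2(norm_spike[!spikeins, "S2"] / norm_spike[!spikeins, "S1"])

print(rbind(all_rows = sf_all, spike_ins = sf_spike))
print(cbind(lfc_all, lfc_spike))
```

Here the all-rows size factors absorb the global doubling (the tRNA fold changes come out as 0), while the spike-in size factors keep it (fold changes of 1). So the main thing to check in your real data is whether the two sets of size factors disagree, e.g. by plotting sizeFactors(dds) from both runs against each other, and whether a global shift in tRNA levels is biologically plausible in your experiment.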