I am trying to reproduce the exon imbalance analysis from this paper (specifically, Figure 1C): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4184927/
The authors normalize the raw coverage as described below (from the supplementary materials):
Therefore, given a coverage matrix where rows are the exons of gene X and columns are samples, the R code implementing equations A and B should be:
    # equation A, norm across exons
    coverage <- read.table("coverage.tsv", header = TRUE, row.names = 1, sep = "\t", check.names = FALSE)
    coverage <- t(t(coverage * nrow(coverage)) / colSums(coverage))
    # equation B, norm across samples
    coverage <- (coverage * ncol(coverage)) / rowSums(coverage)
    coverage <- log10(coverage)
Is that right?
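To make my reading of the two steps explicit, here is a minimal sketch on a made-up toy matrix (the numbers and the names `toy`, `norm_a`, `norm_b`, `alt_a`, `alt_b` are mine, not from the paper). It assumes equation A divides each sample by its mean exon coverage and equation B divides each exon by its mean across samples, which is what my code above does; I may be misreading the supplementary text.

```r
# Toy coverage matrix: rows = exons, columns = samples (made-up numbers).
toy <- matrix(c(10, 20,  30,
                40, 80, 120,
                 5, 10,  45),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("exon", 1:3), paste0("sample", 1:3)))

# Step A (my reading): divide each sample (column) by its mean exon coverage.
norm_a <- sweep(toy, 2, colMeans(toy), "/")

# Step B (my reading): divide each exon (row) by its mean across samples.
norm_b <- sweep(norm_a, 1, rowMeans(norm_a), "/")

# The t()/colSums()/rowSums() formulation from my code above, for comparison.
alt_a <- t(t(toy * nrow(toy)) / colSums(toy))
alt_b <- (alt_a * ncol(alt_a)) / rowSums(alt_a)

# Both formulations should agree up to floating-point tolerance.
stopifnot(isTRUE(all.equal(norm_b, alt_b)))

# After step A every column has mean 1; after step B every row has mean 1.
stopifnot(isTRUE(all.equal(unname(colMeans(norm_a)), rep(1, 3))))
stopifnot(isTRUE(all.equal(unname(rowMeans(norm_b)), rep(1, 3))))

# Final log10 transform; an exon with zero coverage would give -Inf here.
log_norm <- log10(norm_b)
```

So the code at least does what I intend it to do; if the result still differs from the paper, the difference is probably in how I interpret the equations rather than in the R syntax.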
Supplementary Table 4 provides the raw coverage for the genes NOTCH1 and NOTCH2, along with the normalized values the authors use for the exon imbalance analysis. However, using the code above, I do not obtain the same normalized values.
Do you have any clue what I am doing wrong?