I wrote a script on how to analyze the microarray-based miRNA expression data. Here is my code:
# general config
baseDir <- '.'
annotfile <- 'mirbase_genelist.tsv'
setwd(baseDir)
options(scipen = 99)
require(limma)
# read in the data
targets <- read.csv("/media/mdrcubuntu/46B85615B8560439/microarray_text_files/targets.txt", sep="")
# retain information about background via gIsWellAboveBG
project <- read.maimages(targets,source = 'agilent.median',green.only = TRUE, other.columns = c("gIsWellAboveBG","gIsSaturated","gIsFeatPopnOL","gIsFeatNonUnifOL")
)
# annotate the probes (gene annotation)
annotLookup <- read.csv(
annotfile,
header = TRUE,
sep = '\t',
stringsAsFactors = FALSE)
colnames(annotLookup)[1] <- 'ProbeID'
annotLookup <- annotLookup[which(annotLookup$ProbeID %in% project$genes$ProbeName),]
annotLookup <- annotLookup[match(project$genes$ProbeName, annotLookup$ProbeID),]
table(project$genes$ProbeName == annotLookup$ProbeID) # check that annots are aligned
project$genes$ProbeID <- annotLookup$ProbeID
project$genes$wikigene_description <- annotLookup$wikigene_description
project$genes$ensembl_gene_id <- annotLookup$ensembl_gene_id
project$genes$targetID <- annotLookup$TargetID
project$genes$mirbase_accession <- annotLookup$mirbase_accession
project$genes$external_gene_name <- annotLookup$external_gene_name
# perform background correction on the fluorescent intensities (subtract the background)
project.bgcorrect <- backgroundCorrect(project, method = 'normexp', offset=16)
# normalize the data with the 'quantile' method
project.bgcorrect.norm <- normalizeBetweenArrays(project.bgcorrect, method = 'quantile')
# filter out control probes, those with no symbol, and those that fail (probe filter)
Control <- project.bgcorrect.norm$genes$ControlType==1L
NoSymbol <- is.na(project.bgcorrect.norm$genes$ProbeName)
project.bgcorrect.norm.filt <- project.bgcorrect.norm[!Control & !NoSymbol,]
# for replicate probes, replace values with the mean
# ID is used to identify the replicates
project.bgcorrect.norm.filt.mean <- avereps(project.bgcorrect.norm.filt,ID = project.bgcorrect.norm.filt$genes$ProbeName)
#The normalised expression matrix is then contained in project.bgcorrect.norm.filt.mean$E
#Create the study design and comparison model(DEG filter)
Group <- factor(targets$Treatment, levels=c("Normal","Diseased"))
design <- model.matrix(~0+Group)
colnames(design) <- c("Normal","DiseasedvsNormal")
#Checking the experimental design
design
#Applying the empirical Bayes method with t-test
fit <- lmFit(project.bgcorrect.norm.filt.mean, design)
contrast.matrix <- makeContrasts(Normal-DiseasedvsNormal, levels=design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)
summary(decideTests(fit2))
probeset.list <- topTable(fit2,coef=1, number=Inf, adjust.method ="BH", lfc=1, p.value = 0.05)
results <- decideTests(fit2, method="global")
vennDiagram(results, include=c("up","down"))
After running this, I am not getting any differentially expressed miRNAs. Can anyone please tell me whether this code is correct or having some error? I will fix it, but it would be very helpful if someone can help me regarding this. Thank you.
I got this after running
model.matrix
Cool, in this case, I would change this line:
to:
It looks like this will not make any difference to the results, though. It can be that there are genuinely no statistically significantly differentially expressed miRNAs
yeah I am getting no miRNAs.
then what should be the contrast matrix- normal-diseased?
Yes, probably then need:
I never use the
decideTests()
function.Directionality is important:
Disease - Normal
== Numerator, Disease; Denominator, NormalNormal - Disease
== Numerator, Normal; Denominator, DiseaseThank you so much kevin. It helped me a lot.
This code gave me all upregulated miRNAs. Why is it? It should give me some up and downregulated miRNAs. When I am using Disease - Normal, it is giving up and Normal - Disease, it is giving me down. How it can be rectified?
Perhaps it is the genuine result. Please show some of the output table, if you can?
There is no problem here. This is expected behaviour.
Disease - Normal
equals Disease divided by Normal (for fold change calculations)Normal - Disease
equals Normal divided by Disease (for fold change calculations)