Bioinformatics tutoring
26 days ago

Hi everyone,

I'm a new PhD student and I'm struggling to analyse microarray data in R.

If anyone could help, it would be much appreciated.

Thank you!

differentially genes expressed microarray • 742 views

This is a very open-ended question, I am afraid. What did you try? Have you read, e.g., the limma user guide? limma is the standard package for differential analysis of arrays. Find it at https://bioconductor.org/packages/release/bioc/html/limma.html


Yes, I have read through it, but I only understand parts of it. For example, I can only filter out the lowly expressed genes and draw the MDS plot showing distances between expression profiles.

When I try to run some of the commands in R, they won't run.


I suggest working your way through it step by step; if something fails, google the error message. If that doesn't lead you anywhere, post a more specific question, giving the exact commands you tried and the error messages you got.

More immediate help might come from local R user groups or bioinformatics chat groups.

If you are working mainly with Bioconductor packages, then https://support.bioconductor.org/ is a better place to ask.


Thank you for your help. I will try to post a more specific question.

I've been working through the workflow below:

https://combine-australia.github.io/RNAseq-R/06-rnaseq-day1.html#References

Do you happen to know what this command means?

ann <- select(org.Mm.eg.db,keys=rownames(results.ordered),columns=c("ENTREZID","SYMBOL","GENENAME"))

It's come up with an error:

Error in .testForValidKeys(x, keys, keytype, fks) : None of the keys entered are valid keys for 'ENTREZID'. Please use the keys method to see a listing of valid arguments.

Thank you
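
A note on the error above: `select()` from AnnotationDbi matches `keys` against the column named by `keytype`, and for the org.*.eg.db packages the default key type is ENTREZID. The message therefore suggests that the row names are not Entrez IDs. Here is a minimal sketch, assuming the row names are gene symbols. The three mouse symbols are illustrative placeholders and are not taken from the data in this thread.

```r
library(AnnotationDbi)
library(org.Mm.eg.db)

# List the identifier types this annotation package can look up
keytypes(org.Mm.eg.db)

# Hypothetical keys: gene symbols standing in for rownames(results.ordered)
ids <- c("Sox9", "Acan", "Col2a1")

# State the key type explicitly instead of relying on the ENTREZID default
ann <- AnnotationDbi::select(org.Mm.eg.db,
                             keys = ids,
                             keytype = "SYMBOL",
                             columns = c("ENTREZID", "GENENAME"))
ann
```

If the row names are array probe IDs, the key types of an org package are unlikely to match, and the platform's own annotation (or a gene symbol column, if the file has one) would be needed instead.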


Again, a very open-ended statement. "It won't run": what does that mean? Here is an end-to-end workflow for Affy arrays; maybe that helps: https://www.bioconductor.org/packages/release/workflows/vignettes/maEndToEnd/inst/doc/MA-Workflow.html

If you are new and want to teach yourself, it takes time and dedication, but it is doable. Many people here have no formal bioinformatics background, myself included. There are many resources on the internet.


I meant that when I run many of the commands in R, they come up with an error. Sorry, I'm very new to this; I have spent four days trying to work on it with no luck.


It may help to describe in detail the data that you have (file extension; source), to show the R commands that you have used, and also to show the error messages. Otherwise, how can anybody help you?
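
As a rough illustration of what such a description could contain, here is a base-R sketch. The small data frame `expr` is a hypothetical stand-in for an imported spreadsheet, and its column names and values are made up.

```r
# Hypothetical stand-in for an imported spreadsheet
expr <- data.frame(ProbeName = c("probe_1", "probe_2", "probe_3"),
                   Day0_1 = c(7.1, 9.4, 5.2),
                   Day0_2 = c(7.3, 9.1, 5.0))

dim(expr)            # number of rows and columns
str(expr)            # column names and types
dput(head(expr, 3))  # small excerpt that others can paste into R
sessionInfo()        # R version and loaded packages
```

Pasting this output together with the exact command and the full error message usually gives readers enough to reproduce the problem.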


It is quite challenging because I can't upload the data on this forum. Does anyone have an email address and could help me that way? I have microarray data, given to me in Excel, on mesenchymal stem cells that are differentiating into chondrocytes. The treatment conditions are day 0, day 7, day 14, and day 21, with 3 replicates for each condition. There are 32,407 probe names/cells. I want to determine the top 20 genes in MSC differentiation and which genes are differentially expressed across the different conditions.

What would be the best way to start? I have imported the Excel spreadsheet and used the following commands:

library(edgeR)
library(limma)
library(Glimma)
library(org.Mm.eg.db)
library(gplots)
library(RColorBrewer)
library(NMF)

seqdata <- MSCs   # MSCs is the dataset
dim(seqdata)

countdata <- seqdata[,-c(1,14)]
rownames(countdata) <- genes
rownames(countdata) <- seqdata[,1]

y <- DGEList(countdata)
y
y$samples
group <- c("Day 14", "Day 14", "Day 14", "Day 7", "Day 7", "Day 7", "Day 21", "Day 21", "Day 21", "Day 0", "Day 0", "Day 0")
group
y$samples$group <- group
y$samples

myCPM <- cpm(countdata)
thresh <- myCPM > 0.5
table(rowSums(thresh))
keep <- rowSums(thresh) >= 2
summary(keep)

plot(myCPM[,100],countdata[,100])
plot(myCPM[,1],countdata[,1],ylim=c(0,50),xlim=c(0,3))
(the plot didn't work and came up as an error)

Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 'plot': subscript out of bounds
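
The "subscript out of bounds" error most likely comes from `myCPM[,100]`: the second index selects a column, i.e. a sample, and by the description in this post there are only 12 samples. A small base-R illustration follows, using a made-up 5 x 12 matrix rather than the real data.

```r
# Made-up matrix: 5 probes (rows) x 12 samples (columns)
m <- matrix(seq_len(60), nrow = 5, ncol = 12)

ncol(m)  # how many samples are available as column indices

# Asking for column 100 fails; tryCatch captures the error message
msg <- tryCatch(m[, 100], error = function(e) conditionMessage(e))
msg

# A column index within 1..ncol(m) works
first_sample <- m[, 1]
first_sample
```

Checking `dim()` before indexing shows which column numbers are valid.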

group
design <- model.matrix(~ 0 + group)
design

colnames(design) <- levels(group)
design

par(mfrow=c(1,1))
v <- voom(y,design,plot = TRUE)
v

fit <- lmFit(v)
names(fit)

cont.matrix <- makeContrasts(day0Vsday21=Day 0- Day 21,levels=design)
(came up as error)
Error: unexpected numeric constant in "cont.matrix <- makeContrasts(day0Vsday21=Day 0"

OK, that is not a big deal: you cannot have unquoted spaces in variable or factor names. Try:

cont.matrix <- makeContrasts(day0Vsday21=`Day 0` - `Day 21`, levels=design)

and see what happens.

To get better advice, maybe you could use built-in data; most packages come with some. People might not want to download random Excel files due to malware concerns, but you could share text files via GitHub, or use public files on e.g. Google Drive. But first, try to work with example data.


It still came up with an error:

Error in makeContrasts(day0Vsday21 = Day 14 - Day 21, levels = design) : The levels must by syntactically valid names in R, see help(make.names).

Someone sent me an RNA-seq workflow and I've been using it on microarray data. I'll have to start again.


Just to finalize this, I think you need makeContrasts(day0Vsday21=`Day 0` - `Day 21`, levels=colnames(design))

But as you are using the wrong workflow for the data, it doesn't really matter.
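
For reference, one way to avoid the "syntactically valid names" error altogether is to remove the spaces from the group labels before building the design matrix. This is a minimal sketch, assuming the sample order given earlier in the thread; the label style (`Day0`, `Day7`, ...) and the contrast name `Day21vsDay0` are my own choices.

```r
library(limma)

# Sample labels in the column order described in the thread
raw <- rep(c("Day 14", "Day 7", "Day 21", "Day 0"), each = 3)

# Drop the spaces so the labels are valid R names, and store them as a factor
group <- factor(gsub(" ", "", raw),
                levels = c("Day0", "Day7", "Day14", "Day21"))

# One column per time point, no intercept
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Contrast: day 21 minus day 0
cont.matrix <- makeContrasts(Day21vsDay0 = Day21 - Day0, levels = design)
cont.matrix
```

Note that `levels()` returns NULL for a plain character vector, which is why `group` is made a factor here before its levels are used as column names.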


Btw, this is RNA-seq data, not microarray.


Do you know what the first steps of a microarray workflow are if I'm using an Excel spreadsheet that has 14 columns: ProbeName, Day 14-1, Day 14-2, Day 14-3, Day 7-1, Day 7-2, Day 7-3, Day 21-1, Day 21-2, Day 21-3, Day 0-1, Day 0-2, Day 0-3, GeneSymbol? Thank you


This depends on a lot of things. The only thing I know so far is that you have a sort of time series, but it also depends on the platform and on the normalization that was applied. Likely you can still use limma (without voom); try to work your way through the limma user guide. That is all I can say without having the data and provenance information.
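
To make the "limma without voom" suggestion concrete, here is a rough sketch on simulated data. It assumes the values are already background-corrected, normalised, and on the log2 scale, which cannot be verified from the thread. The matrix `expr`, the probe names, and the contrast names are all hypothetical, and reading the spreadsheet into R is not shown.

```r
library(limma)

set.seed(1)
# Simulated stand-in for a normalised log2 expression matrix:
# 200 probes x 12 samples (random values, not real data)
expr <- matrix(rnorm(200 * 12, mean = 8, sd = 1), nrow = 200,
               dimnames = list(paste0("probe_", 1:200), NULL))

# Time points in the column order described in the thread
group <- factor(rep(c("Day14", "Day7", "Day21", "Day0"), each = 3),
                levels = c("Day0", "Day7", "Day14", "Day21"))
colnames(expr) <- paste0(group, "_", rep(1:3, times = 4))

# One coefficient per time point
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Fit the linear model, then compare each later time point with day 0
fit <- lmFit(expr, design)
cont.matrix <- makeContrasts(Day7vsDay0 = Day7 - Day0,
                             Day14vsDay0 = Day14 - Day0,
                             Day21vsDay0 = Day21 - Day0,
                             levels = design)
fit2 <- eBayes(contrasts.fit(fit, cont.matrix))

# Top 20 probes for one contrast, and top 20 by F-test over all three
top_d21 <- topTable(fit2, coef = "Day21vsDay0", number = 20)
top_any <- topTable(fit2, number = 20)
```

With real data, the platform-specific steps (background correction, normalisation, probe filtering) come before `lmFit()`, and those depend on the provenance information mentioned above.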