Question: excluding a list of genes from an expression profile
5.1 years ago, A3.9k wrote:

sorry friends,

I have a microarray dataset with genes in rows and samples in columns. I also have a list of genes, and I need only those genes from the dataset; I want to exclude all the other genes. In other words, I just need the expression of the genes in my list, not of every gene in the rows. How can I exclude the rest of the genes? At first I thought about using a Venn diagram to see the intersection and then cutting and pasting the intersection manually in Excel, but I think this would take a long time.

thank you

myposts R gene • 1.2k views
modified 5.1 years ago by vassialk190 • written 5.1 years ago by A3.9k

Consider this example

Your microarray dataset matrix => X

Genes   Sample-1       Sample-2    Sample-3
    A        0.2            0.3         0.4             
    B        0.9            2.0         0.9
    C        1.0            2.3           4
    D          2            2.8         3.2
    E        2.2            1.7         0.1

The list of genes you want to extract => Y

Genes
    C
    E
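    A

In R, match the gene column of Y against the gene column of X and subset the rows:

# Row positions in X of the genes listed in Y (NA for a gene that is not in X)
XS1 <- match(Y[,1],X[,1],nomatch=NA_integer_,incomparables=NULL)
# Keep only those rows, in the order given by Y
your_genes <- X[XS1,]
# The output is tab-delimited text, whatever the file extension says
write.table(your_genes, file = "final_list.xls", sep = "\t", col.names = TRUE, row.names = FALSE)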

If you are working with R, I would suggest that you get comfortable with the R basics first. Good luck.
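A minimal, self-contained sketch of the same idea, using the toy table from this answer. The column names `Sample.1` to `Sample.3` and the extra gene "Z" are assumptions made for illustration only; `%in%` is shown as a common alternative to `match`, not as part of the original answer.

```r
# Toy data mirroring the example table (values copied from it)
X <- data.frame(
  Genes    = c("A", "B", "C", "D", "E"),
  Sample.1 = c(0.2, 0.9, 1.0, 2.0, 2.2),
  Sample.2 = c(0.3, 2.0, 2.3, 2.8, 1.7),
  Sample.3 = c(0.4, 0.9, 4.0, 3.2, 0.1),
  stringsAsFactors = FALSE
)
Y <- data.frame(Genes = c("C", "E", "A"), stringsAsFactors = FALSE)

# match(): one row per entry of Y, returned in the order of Y
by_match <- X[match(Y$Genes, X$Genes), ]

# %in%: rows of X whose gene appears in Y, returned in the order of X
by_in <- X[X$Genes %in% Y$Genes, ]

# The complement: every gene that is NOT in the list
excluded <- X[!(X$Genes %in% Y$Genes), ]

# Hypothetical list containing one gene ("Z") that is absent from X
wanted <- c(Y$Genes, "Z")
idx <- match(wanted, X$Genes)
missing_genes <- wanted[is.na(idx)]   # genes that were not found in X
found <- X[idx[!is.na(idx)], ]        # drop NA positions before subsetting
```

Dropping the `NA` positions matters because indexing a data frame with `NA` produces a row filled with `NA`, which would otherwise end up in the output file.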

modified 12 months ago by _r_am30k • written 5.1 years ago by Jeevan20

Sorry, I did it as shown below, but the columns no longer have the sample names:

> setwd("E:/Affy data Col-0 priming")
> RMA <- read.delim("E:/Affy data Col-0 priming/RMA.txt", header=FALSE)
>   View(RMA)
> mycounts <- read.table("RMA.txt", sep="\t", header=TRUE)
> excluding <- read.table("E:/Affy data Col-0 priming/excluding.txt", quote="\"", comment.char="")
>   View(excluding)
> S <- read.table("excluding.txt", sep="", header=F)
> XS1 <- match(excluding[,1],RMA1[,1],nomatch=NA_integer_,incomparables=NULL)
Error in match(excluding[, 1], RMA1[, 1], nomatch = NA_integer_, incomparables = NULL) : 
  object 'RMA1' not found
> XS1 <- match(excluding[,1],RMA[,1],nomatch=NA_integer_,incomparables=NULL)
> mycounts <- RMA[XS1,]
> write.table(mycounts, file = "final_list.txt", sep = "\t", col.names = TRUE, row.names = FALSE)
> write.table(mycounts, file = "final_list.txt", sep = "\t", col.names = TRUE, row.names = T)
> header(mycounts)
Error: could not find function "header"
> head(mycounts)
             V1               V2               V3               V4               V5
3425  AT1G53540 4.33805912088666 4.56717597785314 4.31699157444953 6.63733250254801
13839 AT4G10250 5.54088498858581 5.06260528716105 5.58895459608889 5.09960147344372
17241 AT5G12020 4.85075472868197 5.97981091681778 4.61311961971554 6.70084032086995
15287 AT4G27670 4.94783908331899 4.53323839927969 4.64028276820974  4.3887886168914
20530 AT5G59720 6.16724160637511 5.88516161089836 6.24121644210157 5.93223068617705
1497  AT1G18970 5.18955605652289 4.93183017186116 5.25335960179554 5.29563912307846
modified 12 months ago by _r_am30k • written 5.1 years ago by A3.9k

That is because you read RMA with header=FALSE, so the line holding the sample names was treated as data and the columns were named V1, V2, and so on. Try this:

setwd("E:/Affy data Col-0 priming")
RMA <- read.delim("E:/Affy data Col-0 priming/RMA.txt", header=TRUE)
excluding <- read.table("E:/Affy data Col-0 priming/excluding.txt", quote="\"", comment.char="")

XS1 <- match(excluding[,1],RMA[,1],nomatch=NA_integer_,incomparables=NULL)
mycounts <- RMA[XS1,]
write.table(mycounts, file = "final_list.xls", sep = "\t", col.names = TRUE, row.names = FALSE)

I really suggest that you take an R course before proceeding with the analysis.
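To see the effect of the `header` argument in isolation, here is a small sketch that writes a miniature tab-delimited file to a temporary location and reads it both ways. The column names (`Gene`, `S1`, `S2`) and the rounded expression values are made up for illustration; only the two gene IDs come from the output above.

```r
# Hypothetical miniature of RMA.txt: one header line, then one row per gene
tmp <- tempfile(fileext = ".txt")
writeLines(c("Gene\tS1\tS2",
             "AT1G53540\t4.34\t4.57",
             "AT4G10250\t5.54\t5.06"), tmp)

# header = FALSE: the header line is read as the first data row
no_header <- read.delim(tmp, header = FALSE)

# header = TRUE: the first line supplies the column names
with_header <- read.delim(tmp, header = TRUE)

colnames(no_header)    # default names V1, V2, V3
colnames(with_header)  # names taken from the file

# With header = FALSE the expression columns are no longer numeric,
# because each one now also contains a sample name
sapply(no_header, class)
sapply(with_header, class)

unlink(tmp)
```

A side effect worth checking in your own data: when the header is read as data, the expression values are stored as text (or factors in older R versions), so any later arithmetic on them will fail or give nonsense.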

modified 12 months ago by _r_am30k • written 5.1 years ago by Jeevan20

thank you,

your code worked

written 5.1 years ago by A3.9k

Couldn't you take an introductory R course or something? You could learn how to index a matrix properly, e.g. using match. No offense, but it looks like you keep getting stuck because you don't know the basics of R.

written 5.1 years ago by Michael Dondrup48k

thank you Michael,

I am at the Max Planck Institute of Colloids and Interfaces, Berlin. Since my arrival I have searched a lot for courses on R, bioinformatics and so on, but found nothing. I asked many people in the bioinformatics department for help, but they are too busy and turned me down in a few words. On Biostar I have always been able to solve my problem, even if only after some arguments!

written 5.1 years ago by A3.9k

Maybe you could join the Berlin R-users group (http://www.meetup.com/Berlin-R-Users-Group/); they might know about more courses in the area.

Also watch out for courses like this one:

http://www.r-bloggers.com/hands-on-computational-genomics-course-in-berlin/

It is over now, but there might be more courses like it in the future.

written 5.1 years ago by Michael Dondrup48k

thank you, hope to learn R..

written 5.1 years ago by A3.9k
5.1 years ago, vassialk190 (Belarus) wrote:

Use standard statistical software for data sheet manipulation: StatsDirect, SPSS, STATA, Minitab, SAS, JMP Genomics (one of the best for your goals), STATISTICA, DeltaGraph. Specialized software such as Genesis, MeV, Mayday and Expander will be of value to you too. Sure, R is a free language, but you will torture yourself with data import and lose your life writing code while dreaming about IT industry billions....

written 5.1 years ago by vassialk190