Tutorial: Rembrandt Glioma Data Analysis (PART I) - Are gender-specific genes related to cancer?

7.8 years ago

majuang66 ▴ 140

I previously posted "Tutorial: Importance of Array Quality Control - arrayQualityMetrics (PART I)". I have been analyzing the Rembrandt data (brain tumor) to find new insights into glioma, and this post is part of that analysis. In conclusion, clustering analysis of the Rembrandt data showed a gender-specific gene expression pattern in the 43 genes selected by nonspecific gene filtering.

# Log in to the Linux server and change to the folder holding the Rembrandt glioma data
# (580 microarrays: astrocytoma, oligodendroglioma, GBM, normal, and unknown)
$ cd Rembrandt_Glioma
$ R # start R

# Import the Rembrandt data into R
library(affy)

# ReadAffy() reads all CEL files in the current working directory
mydata <- ReadAffy()
# Multi-array normalization with RMA
mydata_rma <- rma(mydata)

# Array Quality Control through arrayQualityMetrics
library(arrayQualityMetrics)
# arrayQualityMetrics of mydata
arrayQualityMetrics(expressionset=mydata,outdir="Report_for_Rembrandt_RMA",force=TRUE,do.logtransform=TRUE)
# arrayQualityMetrics of mydata_rma
arrayQualityMetrics(expressionset=mydata_rma,outdir="Report_for_nRembrandt_RMA",force=TRUE)

# exprs() extracts the probes x samples expression matrix from the ExpressionSet
write.table(exprs(mydata_rma), file="Rembrandt_RMA_QC.txt", sep="\t", quote=FALSE, row.names=TRUE, col.names=TRUE)

Based on the arrayQualityMetrics report, I removed 31 outlier samples from the 580 by editing the exported file in Excel, and then saved the edited file as a tab-delimited file.
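The same sample removal can also be done directly in R, which avoids hand-editing the file. The following is only a minimal sketch on a toy matrix: the sample names and the `outlier_samples` vector are hypothetical placeholders, and in practice the list would come from the arrayQualityMetrics report.

```r
# Toy expression matrix standing in for the RMA output (probes x samples).
# All names below are hypothetical and only illustrate the column filtering.
expr <- matrix(seq_len(30) / 10, nrow = 5,
               dimnames = list(paste0("probe", 1:5),
                               paste0("sample", 1:6, ".CEL")))

# Samples flagged as outliers in the QC report (hypothetical list).
outlier_samples <- c("sample2.CEL", "sample5.CEL")

# Keep every column whose name is not in the outlier list.
expr_qc <- expr[, !(colnames(expr) %in% outlier_samples), drop = FALSE]

dim(expr_qc)
```

Keeping the outlier list in a script also leaves a record of which arrays were dropped.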

# Next, I read the edited file back into R (on my local computer) for gene filtering.
mydata <- read.table(file="Rembrandt_RMA_QC.txt", sep="\t", row.names=1, header=TRUE)

# Genefiltering using standard deviation
library(genefilter)

rsd <- rowSds(mydata) # standard deviation of each row (feature) across samples

i <- rsd >= 2 # keep features with a standard deviation of at least 2

mydata_filtered <- mydata[i,] # 43 genes were selected

write.table(mydata_filtered,file="Rembrandt_RMA_QC_filtered.txt",sep="\t", quote=FALSE, row.names=TRUE, col.names=TRUE)
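To illustrate what the standard-deviation filter keeps, here is a small sketch in base R on made-up values. It uses `apply(expr, 1, sd)` in place of `rowSds()`, and the probe names and numbers are invented for illustration only. A probe whose expression splits the samples into two distinct groups has a large standard deviation and passes the filter, while flat probes are dropped.

```r
# Toy log2 expression values for three hypothetical probes across six samples.
expr <- rbind(
  probe_flat  = c(8.0, 8.1, 7.9, 8.0, 8.2, 7.8),    # nearly constant
  probe_mild  = c(6.0, 7.0, 6.5, 7.5, 6.0, 7.0),    # modest variation
  probe_split = c(4.0, 4.2, 3.9, 11.0, 11.3, 10.8)  # two distinct sample groups
)
colnames(expr) <- paste0("sample", 1:6)

# Per-row standard deviation across samples (base R stand-in for rowSds()).
rsd <- apply(expr, 1, sd)

# Same threshold as in the post: keep rows with SD >= 2.
expr_filtered <- expr[rsd >= 2, , drop = FALSE]

rownames(expr_filtered)
```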

# Next, I assessed the clustering tendency of the filtered dataset, that is, whether it contains meaningful clusters (1).

install.packages("clustertend")

library(clustertend)

set.seed(12345)

hopkins(mydata_filtered, n=nrow(mydata_filtered)-1, byrow=TRUE, header=TRUE) # mydata_filtered: samples are the variables and genes are the objects

$H value: 0.2712307 (If the Hopkins statistic is close to zero, we can reject the null hypothesis and conclude that the dataset is significantly clusterable (1).)

mydata_filtered_1 <- t(mydata_filtered)
hopkins(mydata_filtered_1, n=nrow(mydata_filtered_1)-1, byrow=TRUE, header=TRUE) # mydata_filtered_1: genes are the variables and samples are the objects

$H value: 0.288575 (If the Hopkins statistic is close to zero, we can reject the null hypothesis and conclude that the dataset is significantly clusterable (1).)
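For readers who want to see what a Hopkins-type statistic measures, below is a minimal base-R sketch. It is not the clustertend implementation; the function name `hopkins_sketch` and the toy data are assumptions made for illustration. It follows the convention used above, in which values near zero suggest clustered data: nearest-neighbour distances among real points are compared with nearest-neighbour distances from uniformly drawn artificial points to the real points.

```r
# Sketch of a Hopkins-type statistic (hypothetical helper, not clustertend::hopkins).
# Rows of X are objects, columns are variables; n is the number of probe points.
hopkins_sketch <- function(X, n) {
  X <- as.matrix(X)
  stopifnot(n < nrow(X))
  # n artificial points drawn uniformly within the observed range of each column
  lo <- apply(X, 2, min)
  hi <- apply(X, 2, max)
  P <- matrix(sapply(seq_len(ncol(X)), function(j) runif(n, lo[j], hi[j])), nrow = n)
  # n real points sampled without replacement
  idx <- sample(nrow(X), n)
  # Euclidean distance from one point to every row of X
  dist_to_rows <- function(pt) sqrt(rowSums(sweep(X, 2, pt)^2))
  # nearest real neighbour of each artificial point
  u <- apply(P, 1, function(pt) min(dist_to_rows(pt)))
  # nearest other real neighbour of each sampled real point
  w <- sapply(idx, function(i) min(dist_to_rows(X[i, ])[-i]))
  sum(w) / (sum(u) + sum(w))
}

# Toy data: two tight, well-separated clusters of 20 points each.
set.seed(12345)
clustered <- rbind(
  matrix(rnorm(40, mean = 0,  sd = 0.1), ncol = 2),
  matrix(rnorm(40, mean = 10, sd = 0.1), ncol = 2)
)
h <- hopkins_sketch(clustered, n = 10)
h
```

On strongly clustered toy data like this, the statistic should come out close to zero, because real points sit much closer to each other than uniformly placed points sit to the data.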

Reference

(1) Assessing Clustering Tendency: A Vital Issue - Unsupervised Machine Learning, STHDA (http://www.sthda.com)

Written 7.8 years ago by majuang66 • updated 12 months ago by Ram

Where's the bit about allosomes? And why hand-filter in Excel? This is quite the opposite of good practice.

Reply by russhh, 7.8 years ago

Your comment is good. Samples could instead be removed in R using a target file, which is an easy and good method. For a small number of samples I use Excel, but I will post how to remove samples in R rather than in Excel.

Reply by majuang66, 7.8 years ago

I've put the code in code blocks, though this could use a bit more tidying up.

Reply by Devon Ryan, 7.8 years ago

Shouldn't this be changed to the version below, since it is the array QC of the normalized data? In the code:

 # arrayQualityMetrics of mydata_rma
    arrayQualityMetrics(expressionset=mydata,outdir="Report_for_nRembrandt_RMA",force=TRUE)

Corrected to

 # arrayQualityMetrics of mydata_rma
    arrayQualityMetrics(expressionset= mydata_rma,outdir="Report_for_nRembrandt_RMA",force=TRUE)

Reply by ivivek_ngs, 7.8 years ago

Thank you^_^!! Your comment is correct.

Reply by majuang66, 7.8 years ago

score 0 · Answer 1 · 2017-05-18


6.9 years ago

wq06100 • 0

Hi, I would like to know how to download the Rembrandt glioma data. I searched "https://caintegrator.nci.nih.gov/rembrandt/login.do", but I still cannot find it. I hope you can help me; thank you very much.
