Question: Breast cancer TCGA data - DGE analysis
Asked 4.0 years ago by David_emir (India):

Hello All,

I am applying voom normalization to raw RNA-Seq count data obtained from TCGA. I have constructed a matrix of ~20000 rows and 341 columns, with the first column being the Gene_id.

I am using the voom() method to normalise the data. I have written the following code.

## Libraries
library(limma)
library(edgeR)

## Matrix file: first column is the gene ID, the remaining 340 columns are samples
raw.data <- read.delim("Combined_matrix_340.txt")
names(raw.data)
d <- raw.data[, 2:341]
rownames(d) <- raw.data[, 1]

# Pheno data file
pheno<-read.table("pheno_data_BRCA.txt", header=TRUE, sep="\t")

## Design matrix: one column per group, no intercept
Group <- factor(pheno$Status)
design <- model.matrix(~0 + Group)

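## Normalisation
y <- voom(d, design, plot = TRUE)

## Rename the design columns to the group names used in the contrasts below
colnames(design) <- levels(Group)

fit <- lmFit(y, design)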

## Contrast matrix for the group comparisons
cont.wt <- makeContrasts("Metastatic-Normal_Control",
                         "ERPositive-Normal_Control",
                         "PRPositive-Normal_Control",
                         "HER2Positive-Normal_Control",
                         "ER_PR_HER2_Neg-Normal_Control",
                         levels = design)

fit2 <- contrasts.fit(fit, cont.wt)
fit3 <- eBayes(fit2)

DE <- topTable(fit3, coef = 2)

After this, the output is as follows:

Gene_ID            logFC    AveExpr          t         P.Value       adj.P.Val          B
ACTB|60         12.59366   12.54151   202.8138    0.000000e+00    0.000000e+00   806.7855
EEF1A1|1915     12.06986   12.51399   187.5779    0.000000e+00    0.000000e+00   781.7838
ACTG1|71        11.93940   12.03115   179.5847    0.000000e+00    0.000000e+00   767.7521
UBC|7316        10.71139   11.15274   176.8877    0.000000e+00    0.000000e+00   761.7751
TPT1|7178       10.99882   11.58788   159.5321    0.000000e+00    0.000000e+00   728.9007
HSP90AB1|3326   11.00446   11.12925   157.1734   9.881313e-323   3.381237e-319   724.0502
FTH1|2495       10.98239   11.26717   153.0514   8.557019e-319   2.509774e-315   715.3888
EEF2|1938       10.82150   11.46502   151.3403   3.886332e-317   9.973786e-314   711.5412
PSAP|5660       10.71044   11.06326   147.8964   9.572234e-314   2.183639e-310   703.8942
HSP90AA1|3320   10.74747   10.94257   144.5401   2.294330e-310   4.710489e-307   696.4700

My question: I am getting a list of only 10 genes and am not able to pull the full list. I would also like someone to validate my code and the method I followed. Please bear in mind that I am a novice in coding and bioinformatics. Please let me know whether my code is correct or whether I should modify it.

Thanks a lot for your help.

-Ateeq Khaliq

Tags: voom, rna-seq, tcga, raw_count

DE = topTable(fit3, coef = 2, number = 'all')

This returns all genes. By default, topTable outputs only the top ten genes.
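
As a minimal sketch of the difference, the following uses simulated data (the gene names, sample names, and two-group layout are invented for illustration, not taken from the TCGA matrix). It uses `number = Inf`, which is another way to ask topTable for every row.

```r
library(limma)

set.seed(1)
# Simulated log-expression matrix: 200 hypothetical genes x 6 samples
expr <- matrix(rnorm(200 * 6), nrow = 200,
               dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))
group <- factor(rep(c("Control", "Tumour"), each = 3))
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)

fit <- lmFit(expr, design)
cont <- makeContrasts(Tumour_vs_Control = Tumour - Control, levels = design)
fit2 <- eBayes(contrasts.fit(fit, cont))

top10 <- topTable(fit2, coef = 1)                    # default: number = 10
all_genes <- topTable(fit2, coef = 1, number = Inf)  # every gene, sorted by B

nrow(top10)      # 10
nrow(all_genes)  # 200
```

The full table can then be written out with `write.table()` or filtered on `adj.P.Val` as needed.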

written 4.0 years ago by poisonAlien (modified 4.0 years ago)

Thanks a lot, poisonAlien. Can you please validate my code?
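
Nobody in the thread answered this, so what follows is only a hedged sketch of a commonly used limma-voom workflow, run on simulated counts, not a verdict on the posted analysis. It adds three steps that the posted code does not show: wrapping the counts in a DGEList, filtering low-count genes with `filterByExpr()`, and computing TMM factors with `calcNormFactors()` before `voom()`. It also renames the design columns so that the contrast names resolve. The group names are borrowed from the question; the counts, gene names, and contrast labels (`ER_vs_Normal`, `Met_vs_Normal`) are made up. One thing worth checking in the posted output: logFC is almost equal to AveExpr for every gene, which is what a group-mean coefficient would give, not a difference between groups. The last lines show how to tell the two apart.

```r
library(edgeR)  # attaches limma as a dependency

set.seed(42)
# Simulated raw counts: 500 hypothetical genes x 12 samples, no true differences
n_genes <- 500
groups <- rep(c("Normal_Control", "ERPositive", "Metastatic"), each = 4)
counts <- matrix(rnbinom(n_genes * 12, mu = 200, size = 5), nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("sample", 1:12)))

Group <- factor(groups)
design <- model.matrix(~0 + Group)
colnames(design) <- levels(Group)  # drop the "Group" prefix from column names

# Filter low-count genes and compute TMM normalisation factors before voom
dge <- DGEList(counts = counts)
keep <- filterByExpr(dge, design)
dge <- dge[keep, , keep.lib.sizes = FALSE]
dge <- calcNormFactors(dge)

v <- voom(dge, design, plot = FALSE)
fit <- lmFit(v, design)

# Named contrasts, so results can be pulled by name rather than by position
cont <- makeContrasts(ER_vs_Normal  = ERPositive - Normal_Control,
                      Met_vs_Normal = Metastatic - Normal_Control,
                      levels = design)
fit2 <- eBayes(contrasts.fit(fit, cont))
res <- topTable(fit2, coef = "ER_vs_Normal", number = Inf)

# Diagnostic: a group-mean coefficient gives logFC close to AveExpr,
# whereas a contrast on data with no true differences is centred near 0
means <- topTable(eBayes(fit), coef = "ERPositive", number = Inf)
summary(res$logFC)
summary(means$logFC - means$AveExpr)
```

If the real analysis behaves like `means` instead of `res`, it would be worth confirming that `topTable()` is being called on the fit produced by `contrasts.fit()` and that the design column names match the names used in `makeContrasts()`.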

written 4.0 years ago by David_emir

David, could you please tell me how you constructed the matrix?
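
The thread does not say how the matrix was built, so this is only one plausible approach, not necessarily the one David used. It assumes one tab-delimited file per sample with a gene identifier column and a count column; the file names and the column names `Gene_id` and `raw_count` are hypothetical. The sketch writes three tiny files to a temporary directory so that it runs on its own, then merges them into a genes-by-samples table.

```r
# Write three tiny hypothetical per-sample files so the sketch is runnable
tmp <- tempfile("counts_")
dir.create(tmp)
genes <- c("ACTB|60", "EEF1A1|1915", "UBC|7316")
for (s in c("sampleA", "sampleB", "sampleC")) {
  write.table(data.frame(Gene_id = genes, raw_count = sample(100:1000, 3)),
              file.path(tmp, paste0(s, ".txt")),
              sep = "\t", quote = FALSE, row.names = FALSE)
}

files <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)

# Read each file and rename its count column after the sample
per_sample <- lapply(files, function(f) {
  x <- read.delim(f, stringsAsFactors = FALSE)
  names(x)[2] <- sub("\\.txt$", "", basename(f))
  x
})

# Merge on the gene identifier; only genes present in every file are kept
combined <- Reduce(function(a, b) merge(a, b, by = "Gene_id"), per_sample)
dim(combined)

# Save in the layout the question describes: gene ID first, then one column per sample
write.table(combined, file.path(tmp, "Combined_matrix.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE)
```

With real TCGA downloads the sample identifier would normally come from the file metadata, not the file name, so the renaming step would need adjusting.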

written 3.1 years ago by hAjmal