TSV file to dataframe R for Anova
10 months ago
kbaitsi • 0

I have a TSV file with 61 columns and 18703 lines (genes). I want to convert it into an appropriate data frame in order to perform an ANOVA. The TSV file contains 6 conditions (WT, TG, A, B, C, D). I have written the following code

f<-read.table(file = "GeneExpressionDataset_normalized.tsv", sep="\t", header=TRUE) 
data.frame(Expression=as.numeric(f[1,2:61]), Condition = c(rep("WT", 10), rep("TG", 10), rep("A", 10), rep("B",10), rep("C", 10), rep("D", 10)))

for the first line but I am not sure how to loop this in order to get a dataframe for all the lines.

I tried

ff<-sapply(1:nrow(f),function(i){
  x<-as.numeric(f[i,2:61])
  data.frame(Expression=x, Condition = c(rep("WT", 10), rep("TG", 10), rep("A", 10), rep("B",10), rep("C", 10), rep("D", 10)))  
})

and

a <- for (i in 1:nrow(f)){
  data.frame(Expression=as.numeric(f[i,2:61]), Condition = c(rep("WT", 10), rep("TG", 10), rep("A", 10), rep("B",10), rep("C", 10), rep("D", 10)))
}

but it's not working. Any suggestions?
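
For reference, both attempts fail for reasons unrelated to the data: sapply() tries to simplify the returned data frames into a matrix, and a for loop always evaluates to NULL, so assigning it to a leaves nothing behind. A minimal sketch of one way around this, assuming 10 samples per condition in the column order used above, is to use lapply(), which always returns a list. The small simulated table f, the Gene column name, and the names per_gene and p_values are placeholders for illustration, not part of the original dataset.

```r
# Simulated stand-in for the real table (assumption): one ID column plus
# 60 expression columns, here with only 5 genes instead of 18703.
set.seed(1)
f <- data.frame(Gene = paste0("gene", 1:5), matrix(rnorm(5 * 60), nrow = 5))

# Condition labels, assuming 10 consecutive columns per condition.
condition <- factor(rep(c("WT", "TG", "A", "B", "C", "D"), each = 10),
                    levels = c("WT", "TG", "A", "B", "C", "D"))

# lapply() returns a list, so each per-gene data frame is kept intact.
per_gene <- lapply(seq_len(nrow(f)), function(i) {
  data.frame(Expression = as.numeric(f[i, 2:61]), Condition = condition)
})
names(per_gene) <- f$Gene

# One-way ANOVA per gene; keep the p-value of the Condition term.
p_values <- sapply(per_gene, function(d) {
  summary(aov(Expression ~ Condition, data = d))[[1]][["Pr(>F)"]][1]
})
```

Here per_gene[["gene1"]] is the 60-row data frame for the first gene, and p_values is a named vector with one unadjusted p-value per gene. With thousands of genes these would still need multiple-testing correction, for example with p.adjust().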

tsv dataframe r anova loop • 321 views

What is the final goal? Differential expression?


Yes, that's right...


Then why not use established, well-tested and specialised software such as limma? Please go through its very extensive vignette. Other options for DE are DESeq2 or edgeR, but these strictly require raw counts. You seem to have normalized counts, so the limma-trend pipeline could be an option.
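
As a rough illustration of the limma-trend route, the sketch below fits a one-way design across the six conditions and asks for an overall F-test per gene. It assumes the Bioconductor package limma is installed, that the values are on a log scale (limma-trend is meant for log-expression values such as log-CPM), and that there are 10 samples per condition in the order WT, TG, A, B, C, D. The simulated matrix expr and the name res are placeholders, not the real dataset.

```r
library(limma)  # Bioconductor package; assumed to be installed

# Simulated stand-in (assumption): 100 genes x 60 samples of log-scale values.
set.seed(1)
expr <- matrix(rnorm(100 * 60, mean = 5), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("s", 1:60)))

# Condition labels, assuming 10 consecutive columns per condition.
condition <- factor(rep(c("WT", "TG", "A", "B", "C", "D"), each = 10),
                    levels = c("WT", "TG", "A", "B", "C", "D"))

# Design with WT as the reference level: intercept plus 5 condition effects.
design <- model.matrix(~ condition)

fit <- lmFit(expr, design)
fit <- eBayes(fit, trend = TRUE)  # trend = TRUE gives the limma-trend variant

# Testing all non-intercept coefficients together gives an ANOVA-like F-test.
res <- topTable(fit, coef = 2:ncol(design), number = Inf)
```

The res table has one row per gene with an F statistic, a p-value and an adjusted p-value (Benjamini-Hochberg by default). With the real data, expr would be the 60 numeric columns of the TSV converted with as.matrix(), with gene IDs as row names.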
