I need to transform a tab-delimited file like this:
IPR018351 GRMZM2G458776
IPR005731 GRMZM2G047513
IPR005732 GRMZM2G087165 GRMZM2G146818 GRMZM2G427404
IPR018355 GRMZM2G082642 GRMZM2G310283 GRMZM2G406977 GRMZM5G886785
into a list of vectors in R, or into an MgsaSets object from the mgsa R package.
Here's what I have tried.
Putative solution 1.
- Read my file into R:
x=read.table("../tymczasowe/x",sep="\t",row.names=1,fill=T)
- Transform it into a list of vectors:
x_list=split(x,row(x))
My longest line is 1616 fields long, so I moved it to the first line of my original file to make read.table read it correctly. The split command caused R to terminate. I tried the same procedure on a much smaller example and it worked fine.
I also tried to transform my data.frame into an MgsaSets object:
annoIP=new("MgsaSets",sets=as.data.frame(t(x)))
The command appeared to succeed, but it produced one entry more than I expected (one extra gene), and I don't know what this additional entry looks like (I'm not very experienced with S4 objects).
I then tried to run the analysis:
xwyn=mgsa(xprb,annoIP)
(xprb is just a list of genes to analyze) and got this error:
Error in mgsa.trampoline(o, sets[!isempty], n, alpha = alpha, beta = beta, :
Set index to high (must not exceed 'n')
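A possible explanation for the extra entry, offered as a guess rather than a confirmed diagnosis: with fill=TRUE, read.table pads short rows with blank fields, and that padding value may end up counted as one more "gene". The sketch below reproduces this on the four example lines only; the names txt, genes, padding and real_genes are made up for the illustration.

```r
# Toy input: the four example lines, tab-delimited, longest line first
# (mirroring the workaround described above).
txt <- paste(
  "IPR018355\tGRMZM2G082642\tGRMZM2G310283\tGRMZM2G406977\tGRMZM5G886785",
  "IPR018351\tGRMZM2G458776",
  "IPR005731\tGRMZM2G047513",
  "IPR005732\tGRMZM2G087165\tGRMZM2G146818\tGRMZM2G427404",
  sep = "\n"
)

# Same read.table options as in the attempt above, applied to the toy text.
x <- read.table(text = txt, sep = "\t", row.names = 1, fill = TRUE,
                stringsAsFactors = FALSE)

# Collect every distinct value in the table, padding included.
genes <- unique(unlist(x, use.names = FALSE))

# Padding shows up as an empty string (or NA); it is not a real gene ID.
padding    <- genes[is.na(genes) | genes == ""]
real_genes <- setdiff(genes, padding)

print(length(genes))       # distinct values including the padding
print(length(real_genes))  # distinct real gene IDs
```

If the two counts differ by one on the real data as well, the padding value is probably the unexpected entry, and dropping it before building the sets might be enough.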
Putative solution 2.
I tried to get the file into an MgsaSets object for the mgsa package by generating the appropriate R code from it and pasting that code into the command line. The problem is that code like this works for small files:
x=new("MgsaSets",sets=list(IPR001844=c("AC215201.3_FG005","GRMZM2G009871","GRMZM2G015989"),IPR005732=c("GRMZM2G087165","GRMZM2G146818","GRMZM2G427404")...,IPR018816=c("GRMZM2G072156","GRMZM2G566688")))
but doesn't work for my big file, probably because it is too big/long. I got the error message
Error: unexpected ',' in ","
after every transition to the next line of my pasted code, e.g.
,IPR023193=c("GRMZM5G877500")
Now I really have no idea how to create the desired object.
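For reference, one way that might sidestep both read.table and the pasted code is to read the file line by line and split each line on tabs. This is only a minimal sketch on the four example lines, not something verified on the full file; the temporary file stands in for the real path, and the names path, fields and sets are made up for the illustration.

```r
# Stand-in for the real file: write the four example lines to a temp file.
path <- tempfile()
writeLines(c(
  "IPR018351\tGRMZM2G458776",
  "IPR005731\tGRMZM2G047513",
  "IPR005732\tGRMZM2G087165\tGRMZM2G146818\tGRMZM2G427404",
  "IPR018355\tGRMZM2G082642\tGRMZM2G310283\tGRMZM2G406977\tGRMZM5G886785"
), path)

# Read raw lines; rows may have any number of fields, in any order.
fields <- strsplit(readLines(path), "\t", fixed = TRUE)

# First field is the set name, the remaining non-empty fields are the genes.
sets <- lapply(fields, function(f) {
  genes <- f[-1]
  genes[nzchar(genes)]
})
names(sets) <- vapply(fields, function(f) f[1], character(1))

str(sets)

# Assumption: the named list can then be passed to the same constructor
# call used above (requires the mgsa package, so it is left commented out):
# annoIP <- new("MgsaSets", sets = sets)
```

Because each line is split independently, no padding is added to short rows, and the longest line does not have to be moved to the top of the file.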
Thanks, the code did the work :)