goseq for non-native species infinite recursion
4.8 years ago
nsl24 • 0

I'm trying to use the results of a differential expression analysis to look for enriched GO categories using goseq, but I'm having a beast of a time even getting a trial run working for my non-native species.

I have:

• downloaded gene lengths as a numeric vector taken from biomart (Length)
• A gene.vector created from all of the surveyed genes with 1 or 0 depending on DE (from my output file named DE)
• A dataframe containing gene ids and the associated GO terms taken from biomart (Named GOT)

My test code:

assayed.genes=DE$assayed.genes
de.genes=DE$de.genes
gene.vector=as.integer(assayed.genes%in%de.genes)
names(gene.vector)=assayed.genes
Length = LEN$genelength
head(gene.vector)

and I see output like

Cre09.g414550.t1.2.v5.5 0

When I try to make the pwf and run goseq

pwf = nullp(gene.vector, bias.data=Length)
go = goseq(pwf, gene2cat = GOT)

The pwf works and produces a plot, but when I run goseq I get hit with an infinite recursion error:

Error: evaluation nested too deeply: infinite recursion / options(expressions=)?

Followed by "Error during wrapup:" repeated.

Tweaks and googling haven't turned anything up, so I was hoping someone might be able to spot a glaring error in my approach or offer advice.

4.2 years ago
Ruben ▴ 30

Hi nsl24,

I know you have probably moved on, but I had the identical problem and your question was the only one that popped up in my search. So for people struggling with this in the future, here is how I solved the issue in my code.

The solution for me was very simple. I was working with a tibble (from the tibble package) and had forgotten about that. I could either convert it to a named list or coerce the tibble into a plain data frame.

In either case, some genes probably map to many GO terms and others do not map to anything. So you should end up with either a named list in which gene names repeat, each entry pointing at one GO term, or a data frame in which a gene ID repeats across rows, one row per GO term in the other column. This is also pointed out in the package documentation, so it is just something to keep in mind.

The data frame GOT that you used in your example could be altered as follows:

NamedList = GOT$GOterms
names(NamedList) = GOT$GeneIDs
go = goseq(pwf, gene2cat = NamedList)
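
For anyone who wants to try both options side by side, here is a minimal sketch using only base R. The toy GOT table is made up for illustration; the column names GeneIDs and GOterms follow the example above, and dropping blank GO terms is my assumption about what a biomart export may contain. As far as I understand the goseq documentation, gene2cat can also be a list with one entry per gene holding a vector of GO terms, which is what split() produces here, but treat that as something to check against your goseq version.

```r
# Hypothetical stand-in for the biomart export (column names as in the post).
GOT <- data.frame(
  GeneIDs = c("geneA", "geneA", "geneB", "geneC"),
  GOterms = c("GO:0000001", "GO:0000002", "GO:0000001", ""),
  stringsAsFactors = FALSE
)

# Option 1: if GOT is a tibble, as.data.frame() drops the tibble classes
# and leaves a plain data.frame (a no-op for this toy table).
GOT <- as.data.frame(GOT)

# Assumption: unannotated genes show up as blank or NA GO terms; drop them.
GOT <- GOT[!is.na(GOT$GOterms) & GOT$GOterms != "", ]

# Option 2: one list element per gene, each a character vector of GO terms.
gene2cat <- split(GOT$GOterms, GOT$GeneIDs)

str(gene2cat)

# With pwf built by nullp() as in the question, either form could be passed on:
# go <- goseq(pwf, gene2cat = GOT)
# go <- goseq(pwf, gene2cat = gene2cat)
```

Checking class(GOT) before calling goseq is a quick way to see whether you are still holding a tibble.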


Cheers, Ruben