Question: Dataset normalization before gene ontology analysis
Asked 4.7 years ago by tiago211287 (USA):

I performed a GO analysis on a list of genes using the goseq package from Bioconductor. After plotting the results, I could see that the larger the gene list was, the more counts it had in each category. Is there a way to normalize this by the size of each gene list?

Answered 4.7 years ago by svlachavas (Greece):

Dear Tiago211287,

I believe you see this result in your plots because, in RNA-seq generally, the length of a gene is crucial to the measured level of its expression (which in turn is associated with power). Thus, one way to possibly adjust for this when performing a GO analysis with RNA-seq data is to first use the function nullp (see ?nullp):

nullp(DEgenes, genome, id, bias.data=NULL,plot.fit=TRUE)

This will produce a set of relative weights which are "somehow proportional" to how "big" your input genes are.

Then, you can feed the result directly to goseq().
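As a rough illustration of these two steps, here is a minimal sketch. It assumes the example dataset `genes` that ships with goseq, together with the "hg19" genome and "ensGene" identifiers that match it; substitute your own named 0/1 vector, genome and ID type. The final `fracDE` column is not part of goseq: it is just one possible, purely descriptive way to express counts relative to the size of the gene list, and `n_de` and `fracDE` are names made up for this example.

```r
library(goseq)

# Example data shipped with goseq: a named 0/1 vector
# (names = Ensembl gene IDs, 1 = differentially expressed).
data(genes)

# Step 1: fit the probability weighting function (PWF) against gene length.
# "hg19" and "ensGene" are assumptions that match the example data.
pwf <- nullp(genes, "hg19", "ensGene", plot.fit = FALSE)

# Step 2: run the length-corrected GO test on the PWF.
go_results <- goseq(pwf, "hg19", "ensGene")

# Illustration only (not a goseq feature): express each category's DE count
# as a fraction of the whole DE list, so that lists of different sizes can
# be compared on a common scale when plotting.
n_de <- sum(genes)
go_results$fracDE <- go_results$numDEInCat / n_de

head(go_results[, c("category", "numDEInCat", "numInCat", "fracDE")])
```

Note that the p-values returned by goseq() already account for how many genes are in the list; the fraction above only changes how the raw counts look in a plot.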

Hope that helps,
Efstathios


I did that in goseq, generating a PWF (probability weighting function).

Reply written 4.7 years ago by tiago211287

Well then, excuse me, I misunderstood your question. So, did you mean that you used more than one gene list? If so (without being an expert on RNA-seq analysis), why do you want to normalize for the size of each list?

Reply written 4.7 years ago by svlachavas