Question: GOseq Probability Weight Function Interpretation and Use
3 months ago, matt.a.bennett25890 wrote:

Hi,

I'm looking for feedback on a probability weight function (PWF) curve I've generated using GOseq. As I understand it, as long as the curve fits the data points reasonably well, the downstream analysis will be OK. However, mine looks a bit strange compared to examples I've seen elsewhere. This seems to be mainly due to a few extreme outliers, and I'm wondering whether these are unduly affecting the results and whether there is a way to account for them within the nullp() function. I'm also wondering whether I've made a mistake in determining the gene lengths. Here is my process:

A) Get length data for hg38 from Ensembl and keep only my expressed genes (defined as all genes with count > 1 in the treatment of interest; is that threshold too low?):

txdb <- makeTxDbFromEnsembl("Homo sapiens", release = 89)
txsByGene=transcriptsBy(txdb,"gene")
lengthData=median(width(txsByGene))
lengthfilt <- lengthData[names(lengthData) %in% names(gene_list)]

B) Use nullp() to generate a PWF for the expressed genes based on their respective lengths:

pwf = nullp(clustVsAll, "hg38", "ensGene", bias.data = lengthfilt)

[Figure: PWF curve]

Any help/feedback much appreciated!

7 days ago, i.sudbery (Sheffield, UK) wrote:

I think your problem is probably in this line here:

lengthData=median(width(txsByGene))

What you are doing here is getting the length of the primary transcript, i.e. the genomic distance from the TSS to the TTS, introns included. What you actually need is the sum of the exon lengths (i.e. excluding the introns). You can get this with the transcriptLengths() function.
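
As a rough sketch of how that could look: transcriptLengths() (in GenomicFeatures, as far as I know) returns a data frame with one row per transcript, including gene_id and tx_len columns, where tx_len is the summed exon width. The toy table below is a made-up stand-in for that output so the snippet runs without Bioconductor; the gene IDs, the lengths and the small clustVsAll vector are hypothetical, and I am assuming you still want the median across a gene's transcripts.

```r
# Toy stand-in for the data frame from transcriptLengths(txdb).
# IDs and lengths are invented for illustration only.
tx_lens <- data.frame(
  tx_name = c("tx1", "tx2", "tx3", "tx4", "tx5"),
  gene_id = c("geneA", "geneA", "geneA", "geneB", "geneC"),
  tx_len  = c(1200, 1500, 900, 3000, 450),  # summed exon widths in bp
  stringsAsFactors = FALSE
)

# Median spliced (exonic) transcript length per gene, as a named vector.
lengthData <- vapply(
  split(as.numeric(tx_lens$tx_len), tx_lens$gene_id),
  median,
  numeric(1)
)

# Hypothetical named 0/1 vector of the kind passed to nullp().
clustVsAll <- c(geneC = 1, geneA = 0)

# Index by name so the lengths come out in the same order as the gene vector.
lengthfilt <- lengthData[names(clustVsAll)]
print(lengthfilt)
```

With the real data, tx_lens would be replaced by the result of transcriptLengths(txdb). Indexing by names(clustVsAll) rather than filtering with %in% keeps the lengths in the same order as the gene vector, which, as far as I am aware, nullp() expects for bias.data.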
