Question

Accurate estimation of variance of ratios of microarray probe signals

0

Entering edit mode

11.1 years ago

C0mmanderKeen • 0

I am analyzing microarray data using the Bioconductor limma package in R. The experiment is a direct two-color experiment comparing two genotypes. There are two probes for each gene and I would like to analyze the ratio of the probe signals. I am using the standard workflow of first fitting a linear model and then applying the Bayesian hierarchical modeling:

fit <- lmFit(normalizedValues, designMatrix)
fit <- eBayes(fit)
topTable(fit)

So far, I have calculated the ratios of the two probes from the raw data for each replicate and then fed it into the analysis pipeline. However, this does not reflect the variance associated with the raw values that were used to calculate the ratio. My guess is that I am underestimating the variance. Is there a way to accurately perform such an analysis using the limma package? I can calculate an estimate of the variance using Fieller's theorem but I don't know how to feed that into the pipeline without seriously tinkering with the limma functions.

Any suggestions?

Thanks for your help!

microarray limma • 2.2k views

ADD COMMENT • link updated 3.8 years ago by Ram 45k • written 11.1 years ago by C0mmanderKeen • 0

0

Entering edit mode

Are these duplicate probes (also called "duplicate spots") or do they have completely different underlying sequence? In the latter case, I'd be a bit hesitant to try to combine probes into a single variance estimation, since they're often not actually measuring the same thing (be that due to differences in isoform targeting, differences is cross-hybridization, etc.).

ADD REPLY • link 11.1 years ago by Devon Ryan 105k

0

Entering edit mode

The probes measure different things, in my case one is an intron probe the other a splice junction probe. Good point on the different properties of the two probes and the potential problems with calculating a single variance estimate. Are there any alternatives if I want to make use of both sources of information (intron & junction)? Thanks!

ADD REPLY • link updated 5.5 years ago by Ram 45k • written 11.1 years ago by C0mmanderKeen • 0

0

Entering edit mode

At least to my knowledge there's nothing premade for this sort of task, though I've never really looked. These are somewhat strange probes to try to integrate. I expect that the intronic probes will generally show higher variance, since they're picking up unprocessed mRNA, lariats, circular RNAs, and genomic contamination, all of which are likely more variable than the mature mRNA picked up by the junction probes. I would have thought that one would use an array with this sort of design to try to get at changes in RNA processing due to some sort of treatment, though perhaps you have other goals.

ADD REPLY • link 11.1 years ago by Devon Ryan 105k

0

Entering edit mode

I am using this array to analyze difference in RNA splicing between genotypes. My thinking was that if splicing is compromised, the intron signal would go up (more pre-mRNA) and the junction signal would go down (less mature mRNA). This is where the ratio comes from (or in log space, the difference). I could score all splicing events that individually show significantly increased intron probe signals and significantly decreased junction probe signals.

However, I noticed that there are splicing events that have highly significantly increased intron probe signals but unchanged junction probe signals, so I would have to throw them out. This is why I am looking for a score that integrates the two pieces of information.

Thanks for your help and comments! Greatly appreciated!

ADD REPLY • link 11.1 years ago by C0mmanderKeen • 0

0

Entering edit mode

It might make sense to just split the the probe types as if they were replicates and then incorporate probe-type into your model matrix with a genotype interaction (so ~genotype*probe_type). That would allow you to find inconsistent changes in the probes.

ADD REPLY • link 11.1 years ago by Devon Ryan 105k