Question: Differential expression of microarray data in R
0

Hey

I need to do differential expression analysis of microarray data and I am stuck at one point. I need to extract the up-regulated and down-regulated miRNAs from my data frame. The data frame has 5 samples: A, B, C, D and E. A is the parent (reference) sample and the rest are from patients. Each row represents a miRNA, and the value in each column is the background-subtracted value of that miRNA in that sample. On this basis I want to extract the miRNAs that are up-regulated and down-regulated in each sample. I can do pairwise comparisons (A vs. B, A vs. C, A vs. D, A vs. E), or I can compare all samples at the same time, with columns Name, A, B, C, D, E.

My data frame looks like this:

                      A         B        C       D       E
miRNA-168             20.44     60.55    2.11    5.77    300 2.07
miRNA-170,miRNA-70    13.5558   58.558   3.857   8.748   89.97465

and so on.

R • 2.2k views
modified 5.6 years ago by NancyFisherHansen110 • written 5.6 years ago by adnanjaved198860

I think you should divide each value by the mean of its row and then draw a heatmap. A single picture would then show which rows are contrasting.
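A minimal sketch of that row-mean scaling in base R. It assumes the data are held in a numeric matrix with miRNAs as row names; the two rows reuse values quoted elsewhere in this thread purely for illustration, and the object names `expr` and `scaled` are hypothetical.

```r
# Illustrative matrix: rows are miRNAs, columns are samples A..E.
expr <- matrix(
  c(13.5558,   58.558,   3.857,    8.748,     89.97465,
    29062.144, 33955.38, 35251.71, 29010.870, 15940.595),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("miRNA-170,miRNA-70", "hsa-miR-614"),
                  c("A", "B", "C", "D", "E"))
)

# Divide every value by the mean of its own row.
# A matrix divided by a vector of length nrow() recycles down the columns,
# so each row is divided by its own mean.
scaled <- expr / rowMeans(expr)

# Draw the heatmap only in an interactive session; scale = "none" because
# the rows have already been scaled above.
if (interactive()) {
  heatmap(scaled, Rowv = NA, Colv = NA, scale = "none")
}
```

After scaling, every row has mean 1, so values well above or below 1 mark the samples that stand out for that miRNA.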

Yes, I did this before, but it only gives a global picture. My supervisor wants me to extract all miRNAs that are up-regulated or down-regulated when I compare parent sample A with the other samples (B, C, D, E).

Best

Yes, in the heatmap you would see which rows have (B, C, D, E) in contrast with A.

1
Sean Davis26k wrote:

Since you have no replicates, there really aren't any statistical tests that make sense.  So, I'd suggest simply dividing B, C, D, and E by A.  This gives the fold change for each sample with respect to sample A, the parent.  You can then filter rows as you see fit by fold change (where UP will be >1 and DOWN will be less than 1).
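A minimal sketch of this fold-change filter in base R, not a definitive implementation. The matrix reuses two rows of values quoted in this thread for illustration. The cutoff of 2 (UP at 2-fold or more, DOWN at half or less) is an assumption added here; the answer itself only says UP is >1 and DOWN is <1, so set the cutoff as you see fit. The names `expr`, `fc`, `cutoff` and `calls` are hypothetical.

```r
# Illustrative matrix: rows are miRNAs, columns are samples A..E.
expr <- matrix(
  c(13.5558,   58.558,   3.857,    8.748,     89.97465,
    29062.144, 33955.38, 35251.71, 29010.870, 15940.595),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("miRNA-170,miRNA-70", "hsa-miR-614"),
                  c("A", "B", "C", "D", "E"))
)

patients <- c("B", "C", "D", "E")

# Fold change of each patient sample with respect to the parent sample A.
# The A column is recycled down each patient column, giving B/A, C/A, D/A, E/A.
fc <- expr[, patients] / expr[, "A"]

# Assumed cutoff: 2-fold up or down. Change to taste.
cutoff <- 2

# Label every miRNA/sample pair as UP, DOWN or unchanged ("-").
calls <- ifelse(fc >= cutoff, "UP",
                ifelse(fc <= 1 / cutoff, "DOWN", "-"))

# Names of the miRNAs called UP in sample B.
up_in_B <- rownames(fc)[fc[, "B"] >= cutoff]
```

Printing `calls` gives one UP/DOWN label per miRNA and patient sample, and a line like the one for `up_in_B` extracts the miRNA names for any single comparison.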

1
NancyFisherHansen110 wrote:

The main thing to realize here is that because you've only generated one set of miRNA counts per sample, you have no way to determine what a statistically significant result would look like.  A p-value is a measure of how unlikely an miRNA count is, given an assumed null distribution of the counts for each miRNA (without up or down-regulation).  You don't have enough data to make even the simplest guess at what your null distribution looks like, so you don't have enough data to calculate a p-value.
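To make this concrete, here is a small sketch in base R using one value per sample, as in the question (the two numbers are the A and B values quoted for hsa-miR-614 in this thread). With a single measurement the variance is undefined, and a default two-sample `t.test` refuses to run. The names `a`, `b` and `res` are hypothetical.

```r
# One measurement per sample for a single miRNA: no replicates.
a <- 29062.144  # parent sample A
b <- 33955.38   # patient sample B

# The sample variance of a single observation is undefined (NA).
var(a)

# A default (Welch) two-sample t-test needs at least two observations
# per group, so this call stops with an error; tryCatch captures it.
res <- tryCatch(t.test(x = a, y = b), error = function(e) e)
inherits(res, "error")  # TRUE
```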

As Sean suggests, running replicates would enable you to do what you want to do!

--Nancy

Yes, I agree with both of you.

There is one more problem I want to mention here.

My data frame has NA values in some columns. I can't remove them, because then I can't tell whether that miRNA was down-regulated in a given sample.

e.g.

                 A            B     C          D           E
hsa-miR-614      29062.144    NA    35251.71   29010.870   15940.595

This means that hsa-miR-614 is down-regulated in sample B but is expressed in the other samples.

So the question is: if I divide as Sean suggested, any ratio involving NA will be NA. If I replace NA with 0, the row becomes:

29062.144    0    35251.71   29010.870   15940.595

Please suggest a way to do the fold-change calculation that Sean suggested.

Best

I would suggest simply ignoring those NA measurements/fold changes.
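One way to ignore the NA fold changes in base R, as a sketch under stated assumptions: the hsa-miR-614 row is the one quoted above, the second row reuses other values from this thread for illustration, and the 2-fold cutoff and the names `expr`, `fc` and `down` are hypothetical. `which()` drops NA entries, so a missing measurement is simply never called UP or DOWN.

```r
# Illustrative matrix with a missing value for hsa-miR-614 in sample B.
expr <- matrix(
  c(29062.144, NA,     35251.71, 29010.870, 15940.595,
    13.5558,   58.558, 3.857,    8.748,     89.97465),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("hsa-miR-614", "miRNA-170,miRNA-70"),
                  c("A", "B", "C", "D", "E"))
)

patients <- c("B", "C", "D", "E")

# NA propagates: NA / A is NA, so that fold change stays missing.
fc <- expr[, patients] / expr[, "A"]

# Assumed cutoff: 2-fold. which() silently skips NA comparisons.
cutoff <- 2
down <- lapply(patients, function(s) {
  rownames(fc)[which(fc[, s] <= 1 / cutoff)]
})
names(down) <- patients
```

Here `down$B` is empty: the NA for hsa-miR-614 is skipped rather than treated as zero expression, which is what replacing NA with 0 would imply.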

0

When you have thousands of rows, with miRNAs as row names, you are not able to identify them by eye. It has to be done with p-value comparisons by running t-tests, but I have not been able to do that so far. I just want someone to tell me how I can do it.

0

Hey Sean Davis, that seems like a good idea.

Should I first divide B/C/D/E and then divide that result by A, or should A be divided by that result?

I am asking because the result would be different.

For example:

A            B          C          D           E
29062.144    33955.38   35251.71   29010.870   15940.595

result = 33955.38 / 35251.71 / 29010.870 / 15940.595 = 2.082875e-09

A / result = 29062.144 / 2.082875e-09 = 1.39529e+13

or result / A = 2.082875e-09 / 29062.144 = 7.16697e-14

Best

I have not seen a technique that multiplies gene expression measures.  What would that represent?

What I am suggesting is B/A, which would be the fold change of B with respect to A.  Likewise, C/A, D/A, and E/A.  If this math and biologic interpretation do not make sense, I would suggest sitting down with your supervisor to discuss.
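As a small worked check in base R, using the row of values quoted in the question above: each patient sample gets its own ratio against A, computed independently of the others. The chained division from the question is included only for contrast. The names `mir614`, `fc` and `chained` are hypothetical.

```r
# The single row of values from the question, as a named vector.
mir614 <- c(A = 29062.144, B = 33955.38, C = 35251.71,
            D = 29010.870, E = 15940.595)

# Per-sample fold changes with respect to A: B/A, C/A, D/A, E/A.
fc <- mir614[c("B", "C", "D", "E")] / mir614[["A"]]
round(fc, 2)

# For contrast only: the chained division B / C / D / E from the question.
# It mixes all four patient samples into one number with no fold-change meaning.
chained <- mir614[["B"]] / mir614[["C"]] / mir614[["D"]] / mir614[["E"]]
```

The four values in `fc` are the quantities to filter on; a value near 1 means little change relative to A, and a value well below 1 (E/A here is roughly 0.55) suggests lower expression than in A.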