Question: How to identify low- and high-variability rows and columns in a table?
Asked 10 months ago by star240 (Netherlands):

I have a large table whose rows are genomic coordinates and whose columns are genomic features (example below). I would like to separate rows and columns by their variability. I have tried some basic statistics with the code below, but I would like to know whether this is the right approach, or whether there is an alternative statistical method that would be more accurate.

DF:

          Feature_A    Feature_B    Feature_C    Feature_D
cord_1          0.9            1          0.8            1
cord_2          0.6          0.1          0.9          0.5
cord_3            0            0            0            0
cord_4          0.1            0            0          0.2

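Code:

library(matrixStats)            # provides rowVars, rowSds, rowIQRs
m <- as.matrix(DF)              # compute every statistic on the original features only
DF$skew <- rowSkewness(m)       # not part of matrixStats; must come from another package (e.g. fBasics)
DF$var <- rowVars(m)
DF$sd <- rowSds(m)
DF$IQR <- rowIQRs(m)
DF$mean <- rowMeans(m)
DF$coef.var <- DF$sd / DF$mean  # NaN for all-zero rows such as cord_3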

I would like to flag cord_2 as the more variable row and ignore cord_1, cord_3 and cord_4 in my output. Given that, which statistic is the best choice?

Modified 9 months ago by Biostar • written 10 months ago by star240

Use IQR!

Written 10 months ago by German.M.Demidov
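
For illustration, the IQR suggestion can be sketched in base R on the example table from the question. This is only a minimal sketch: the matrix `m` is rebuilt by hand from the posted values, and the `cutoff` of 0.2 is a hypothetical threshold chosen for this example, not something given in the thread. On real data the cutoff would need to be chosen from the distribution of IQR values.

```r
# Example table from the question (values copied from the post)
m <- matrix(
  c(0.9, 1.0, 0.8, 1.0,
    0.6, 0.1, 0.9, 0.5,
    0.0, 0.0, 0.0, 0.0,
    0.1, 0.0, 0.0, 0.2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cord_", 1:4),
                  paste0("Feature_", LETTERS[1:4]))
)

# Interquartile range per row (MARGIN = 1) and per column (MARGIN = 2)
row_iqr <- apply(m, 1, IQR)
col_iqr <- apply(m, 2, IQR)

# Hypothetical cutoff: an assumption for illustration, not from the thread
cutoff <- 0.2
high_var_rows <- names(row_iqr)[row_iqr > cutoff]
high_var_rows
```

With these values the row IQRs are 0.125, 0.275, 0 and 0.125, so only cord_2 passes the cutoff. The same comparison on `col_iqr` would select variable features instead of variable coordinates.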