Question: How to identify low- and high-variability rows and columns in a table?
13 days ago by
star190
Netherlands
star190 wrote:

I have a big table whose rows are genomic coordinates and whose columns are genomic features (example below). I would like to separate rows and columns by their variability. I have tried some basic statistics, using the code below, but I would like to know whether this is the right approach or whether there is an alternative statistical method that would be more accurate.

DF:

          Feature_A    Feature_B    Feature_C    Feature_D
cord_1          0.9            1          0.8            1
cord_2          0.6          0.1          0.9          0.5
cord_3            0            0            0            0
cord_4          0.1            0            0          0.2

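Code:

m <- as.matrix(DF)  # compute every statistic on the original feature columns only
DF$skew <- rowSkewness(m)  # not part of matrixStats; assumed to come from another package (e.g. fBasics)
DF$var <- matrixStats::rowVars(m)
DF$sd <- matrixStats::rowSds(m)
DF$IQR <- matrixStats::rowIQRs(m)
DF$mean <- rowMeans(m)
DF$coef.var <- DF$sd / DF$mean  # NaN for all-zero rows such as cord_3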

I would like to keep cord_2 (as the more variable row) and ignore cord_1, cord_3 and cord_4 in my output. Based on that, which statistic is the best choice?

modified 12 days ago • written 13 days ago by star190

Use IQR!

written 13 days ago by kuckunniwid330
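
To see how the IQR suggestion plays out on the example table from the question, here is a minimal sketch in base R. It is not the poster's code. The object names (`m`, `stats`, `variable_rows`) and the cutoff `iqr_cutoff <- 0.2` are assumptions made up for this toy table; a real cutoff would have to be chosen from the distribution of the actual data.

```r
# Example table from the question (rows = coordinates, columns = features).
m <- matrix(
  c(0.9, 1.0, 0.8, 1.0,
    0.6, 0.1, 0.9, 0.5,
    0.0, 0.0, 0.0, 0.0,
    0.1, 0.0, 0.0, 0.2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cord_", 1:4), paste0("Feature_", LETTERS[1:4]))
)

# Per-row spread, computed on the feature columns only.
stats <- data.frame(
  sd   = apply(m, 1, sd),
  IQR  = apply(m, 1, IQR),
  mean = rowMeans(m),
  row.names = rownames(m)
)
stats$coef.var <- stats$sd / stats$mean  # NaN for the all-zero row cord_3

# Hypothetical cutoff, chosen only for this toy table.
iqr_cutoff <- 0.2
variable_rows <- rownames(stats)[stats$IQR > iqr_cutoff]

print(round(stats, 3))
print(variable_rows)

# The same call with MARGIN = 2 gives the per-column (per-feature) spread.
col_iqr <- apply(m, 2, IQR)
```

On this table both the standard deviation and the IQR rank cord_2 highest, and only cord_2 passes the cutoff. The coefficient of variation instead ranks cord_4 highest, because its mean is close to zero, and it is undefined (NaN) for cord_3. That makes the coefficient of variation a poor fit for the stated goal when many rows are near zero.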