                    6.1 years ago
        star
        
    
    I have a large table whose rows are genomic coordinates and whose columns are genomic features (example below). I would like to separate the rows and columns by their variability. I have tried some basic statistics (code below), but I would like to know whether this is the right approach or whether there is an alternative statistical method that would be more accurate.
DF:
          Feature_A    Feature_B    Feature_C    Feature_D
cord_1          0.9            1          0.8            1
cord_2          0.6          0.1          0.9          0.5
cord_3            0            0            0            0
cord_4          0.1            0            0          0.2
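Code:
library(matrixStats)  # provides rowVars, rowSds, rowIQRs
# rowSkewness is not part of matrixStats; it is assumed to come from another loaded package (e.g. fBasics)
M <- as.matrix(DF)    # compute every statistic on the original feature columns only
DF$skew <- rowSkewness(M)
DF$var <- rowVars(M)
DF$sd <- rowSds(M)
DF$IQR <- rowIQRs(M)
DF$mean <- rowMeans(M)
DF$coef.var <- DF$sd / DF$mean  # NaN for all-zero rows such as cord_3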
I would like to keep cord_2 (as the more variable row) and ignore cord_1, cord_3, and cord_4 in my output. Based on that, which statistic is the best choice?
Use IQR!
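To illustrate that suggestion, here is a minimal sketch in base R that rebuilds the example table from the question and filters rows by their IQR. The cutoff `iqr_cutoff` and the variable names are assumptions made for the example; they are not from the original post, and a real cutoff would have to be chosen from the distribution of IQR values in the full table.

```r
# Example table, values copied from the question.
DF <- data.frame(
  Feature_A = c(0.9, 0.6, 0, 0.1),
  Feature_B = c(1,   0.1, 0, 0),
  Feature_C = c(0.8, 0.9, 0, 0),
  Feature_D = c(1,   0.5, 0, 0.2),
  row.names = c("cord_1", "cord_2", "cord_3", "cord_4")
)
M <- as.matrix(DF)

# Per-row spread measures using base R only.
row_iqr <- apply(M, 1, IQR)
row_sd  <- apply(M, 1, sd)
row_cv  <- row_sd / rowMeans(M)  # NaN for the all-zero row cord_3

# Hypothetical cutoff (an assumption for this example).
iqr_cutoff <- 0.2
variable_rows <- names(row_iqr)[row_iqr > iqr_cutoff]
print(variable_rows)
```

On this small example, cord_2 has the largest IQR and is the only row above the assumed cutoff. The coefficient of variation, by contrast, is largest for cord_4 because its mean is close to zero, so it would not single out cord_2 here. The same calculation with `apply(M, 2, IQR)` would rank the feature columns instead of the rows.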