How to preserve the order of dataframe variables in R?
19 months ago
KABILAN ▴ 40

I have a dataset like the one below:

structure(list(PCV = c(0.0178219194071478, 0.0167224679086922, 0.0313796054695457, 0.0272633405874291, 0.00992979365423812, 0.0163545593623028, 0.0125615766079409, 0.0438832908556275, 0.0260965005930162, 0.034959332834335, 0.00651124339985815, 0.00773667420172548, 0.00460174240773309, 0.00940417833578374, 0.00763277410224326, 0.0569674690437892, 0.00554001729154236, 0.0102426634114334, 0.0191710901533892, 0.0127379038986653, 0.00859900586552533, 0.00630188507834846, 0.000184250143156493, 0.00651494443035729, 0.00477417309479366, 0.0298096494477779, 0.0235443699348768, 0.00846982190170002, 0.0197493082323879, 0.00885420900157687, 0.00771739026182587, 0.0227915291110601, 0.000326021119179784, 0.00347808426299245, 0.00244844394159794, 0.0221243684669031, 0.00853034943193308, 0.0117734523728633, 0.00438879865028313, 0.00162737834039006, 0.00102263562640706, 0.00256966419093599, 0.00819905987547494, 0.00356380381933028, 0.00459378907571579, 0.0123769394422116, 0.0162725362822941, 0.00770364870061668, 0.0184835516883016, 0.00798092837759707, 0.00574272817857334, 0.00483107847770393, 0.0017089616030636, 0.00334660568350707, 0.0114543838108249, 0.00288212452973156, 0.00448938651825993, 0.00593444755414696, 0.0103782620446864, 0.00424463992722479, 0.0161764747677885, 0.0105032486560586, 0.061974812175287, 0.00528277075107687, 0.000766055202087631, 0.0198394482053174, 0.00734319673771724, 0.00571223067545781, 0.0061683142070276, 0.00170204019314863, 0.00484076438978875, 0.00222693661639841, 0.0204057550556842, 0.00494096746578935, 0.00642331357982557, 0.000845046692055484, 0.0234690091797697, 0.00520249711980663, 0.0141779818674367, 0.0946105742913523, 0.00496222530713291, 0.066585835547389, 0.000763194722436555, 0.0588866152937399, 0.00300507357098326, 0.0662912715588685, 0.00358567303889042, 0.0017549310798091, 0.0222871772118731, 0.00708496651557248), Type = c("knn_vsn", "knn_vsn", "knn_vsn", "knn_vsn", "knn_vsn", "knn_vsn", "knn_vsn", "knn_vsn", "knn_vsn", "knn_vsn", 
"knn_loess", "knn_loess", "knn_loess", "knn_loess", "knn_loess", "knn_loess", "knn_loess", "knn_loess", "knn_loess", "knn_loess", "knn_rlr", "knn_rlr", "knn_rlr", "knn_rlr", "knn_rlr", "knn_rlr", "knn_rlr", "knn_rlr", "knn_rlr", "knn_rlr", "lls_vsn", "lls_vsn", "lls_vsn", "lls_vsn", "lls_vsn", "lls_vsn", "lls_vsn", "lls_vsn", "lls_vsn", "lls_vsn", "lls_loess", "lls_loess", "lls_loess", "lls_loess", "lls_loess", "lls_loess", "lls_loess", "lls_loess", "lls_loess", "lls_loess", "lls_rlr", "lls_rlr", "lls_rlr", "lls_rlr", "lls_rlr", "lls_rlr", "lls_rlr", "lls_rlr", "lls_rlr", "lls_rlr", "svd_vsn", "svd_vsn", "svd_vsn", "svd_vsn", "svd_vsn", "svd_vsn", "svd_vsn", "svd_vsn", "svd_vsn", "svd_vsn", "svd_loess", "svd_loess", "svd_loess", "svd_loess", "svd_loess", "svd_loess", "svd_loess", "svd_loess", "svd_loess", "svd_loess", "svd_rlr", "svd_rlr", "svd_rlr", "svd_rlr", "svd_rlr", "svd_rlr", "svd_rlr", "svd_rlr", "svd_rlr", "svd_rlr")), class = c("tbl_df", "tbl", "data.frame" ), row.names = c(NA, -90L))

I need to compute the mean of PCV for each Type group. I tried two different approaches:

data %>% group_by(Type)%>% summarise(mean_run = mean(PCV))

and

result <- stats::aggregate(data$PCV, list(data$Type), mean)

Both give the same kind of result, but the order of the groups changes automatically (they come back sorted alphabetically by Type), as shown below:

structure(list(Type = c("knn_loess", "knn_rlr", "knn_vsn", "lls_loess", "lls_rlr", "lls_vsn", "svd_loess", "svd_rlr", "svd_vsn"), mean_run = c(0.0140545756246163, 0.0116801617130501, 0.0236972387280275, 0.00827665570788852, 0.00550126183277225, 0.00852058159590288, 0.0177142846257907, 0.0235206963846695, 0.0135468591570967)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -9L))

But I need the groups in the order in which they appear in the original data, as shown below:

structure(list(Type = c("knn_vsn", "knn_loess", "knn_rlr", "lls_vsn", "lls_loess", "lls_rlr", "svd_vsn", "svd_loess", "svd_rlr"), mean_run = c(0.0236972387280275, 0.0140545756246163, 0.0116801617130501, 0.00852058159590288, 0.00827665570788852, 0.00550126183277225, 0.0135468591570967, 0.0177142846257907, 0.0235206963846695)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -9L))

Kindly suggest some useful code for correcting this issue. Thank you in advance.

R data_variables_order data-frame • 482 views
19 months ago
Basti ★ 2.0k

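Both summarise() and aggregate() return the groups sorted by the grouping variable, and a character column sorts alphabetically. You could convert your Type variable to a factor and set the levels in the order you need, so that the groups follow the level order instead: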

data$Type <- factor(data$Type, levels = c("knn_vsn", "knn_loess", "knn_rlr", "lls_vsn", "lls_loess", "lls_rlr", "svd_vsn", "svd_loess", "svd_rlr"))
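
If you would rather not type the levels by hand, one possible variation is to take them from the data with unique(), which returns values in order of first appearance. The sketch below is only an illustration: it uses a small made-up data frame standing in for the dataset in the question, and base R's aggregate() so that it runs without extra packages. The same factor conversion should also work before the group_by()/summarise() call from the question.

```r
# Toy stand-in for the question's `data` (PCV values here are made up)
data <- data.frame(
  PCV  = c(0.02, 0.04, 0.01, 0.03, 0.05, 0.07),
  Type = c("knn_vsn", "knn_vsn", "knn_loess", "knn_loess", "knn_rlr", "knn_rlr"),
  stringsAsFactors = FALSE
)

# unique() keeps the order of first appearance, so using it as `levels`
# pins the factor order to the order already present in the data
data$Type <- factor(data$Type, levels = unique(data$Type))

# aggregate() returns one row per group, ordered by factor level
result <- aggregate(PCV ~ Type, data = data, FUN = mean)
names(result)[2] <- "mean_run"

print(result)
```

Note that Type in the result is now a factor; wrap it in as.character() if you need plain strings afterwards.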
