subsetting a data frame that has rows with values in more than one column in R
16 months ago
pramach1 ▴ 40

I have a large data frame that has several rows and columns. I need to subset the data frame to the rows that have values in more than one column. This is the data frame.

  Sample      Typhi              Kentucky | 8:i:z6
  F675BNARV   0                  2000(3%)
  F685NARV    0                  0
  F722NARV    2038340 (9.24%)    2882679 (13.07%)

I want to subset the F722NARV row, since it has values in more than one column. How do I do that? I have tried various forms of subset and sapply. Any help with this is appreciated.

dataframe R subsets • 740 views
16 months ago

It's unclear how your data.frame is formatted exactly (you can share part of it via dput(head(df))), but generally speaking the code will look something like this.

library("dplyr")

# If you want all columns except the first to not equal 0.
df |>
  rowwise() |>
  filter(if_all(!1, \(x) x != 0)) |>
  ungroup()

# If you just want more than one column (except the first) to not equal 0.
df |>
  rowwise() |>
  filter(sum(c_across(!1) != 0) > 1) |>
  ungroup()

Again, this code may not work for you depending on how your data.frame is actually formatted, so edit the code as needed or include a reproducible example.
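
For comparison, here is a minimal base R sketch of the second case (more than one non-zero column) that needs no packages. The toy data frame only mirrors the table in the question; the column names Typhi and Kentucky and the assumption that every value is stored as a character string are mine, so adjust them to your real data.

```r
# Toy data mirroring the table in the question (column names are assumed).
df <- data.frame(
  Sample   = c("F675BNARV", "F685NARV", "F722NARV"),
  Typhi    = c("0", "0", "2038340 (9.24%)"),
  Kentucky = c("2000(3%)", "0", "2882679 (13.07%)"),
  stringsAsFactors = FALSE
)

# Count, per row, how many columns other than Sample hold something besides "0".
nonzero_per_row <- rowSums(df[, -1, drop = FALSE] != "0")

# Keep the rows where more than one column is non-zero.
result <- df[nonzero_per_row > 1, ]
result$Sample
```

With this toy input only F722NARV is kept, because it is the only row with a non-zero entry in both value columns.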

# If you just want more than one column to not equal 0.
df |>  rowwise() |> filter(sum(c_across(!1) != 0) > 1) |> ungroup()

Worked for what I was looking for. Thank you.


Thank you. Will try this. In the meantime, here is the output of dput(head(df2)):

> dput(head(df2))
structure(list(Sample = c("F675BNARV", "F685NARV", "F715NARV", 
"F717NARV", "F722NARV", "F762NARV"), I.48.z4.z24.1.5...48.z4.z24.1.5 = c("0", 
"0", "0", "0", "2038340 (9.24%)", "0"), Kentucky...8.i.z6 = c("0", 
"0", "0", "0", "2882679 (13.07%)", "0"), Molade.or.Wippra...8.z10.z6 = c("831691 (3.69%)", 
"0", "0", "0", "0", "0"), Montevideo...7.g.m.s.NA = c("0", "530046 (2.21%)", 
"7823859 (39.08%)", "0", "0", "0"), Newport...8.e.h.1.2 = c("0", 
"0", "0", "6689807 (22.29%)", "0", "2864791 (9.66%)")), row.names = c(NA, 
6L), class = "data.frame")
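
Note that every value column in this structure is character, so a test like x != 0 compares strings rather than numbers. As a rough sketch only, the counts could be pulled out as numbers before filtering. The helper name count_of is hypothetical, and it assumes each cell is either "0" or a count followed by a percentage in parentheses, as in the six rows posted here.

```r
# First six rows, copied from the dput() output in this post.
df2 <- structure(list(Sample = c("F675BNARV", "F685NARV", "F715NARV",
"F717NARV", "F722NARV", "F762NARV"), I.48.z4.z24.1.5...48.z4.z24.1.5 = c("0",
"0", "0", "0", "2038340 (9.24%)", "0"), Kentucky...8.i.z6 = c("0",
"0", "0", "0", "2882679 (13.07%)", "0"), Molade.or.Wippra...8.z10.z6 = c("831691 (3.69%)",
"0", "0", "0", "0", "0"), Montevideo...7.g.m.s.NA = c("0", "530046 (2.21%)",
"7823859 (39.08%)", "0", "0", "0"), Newport...8.e.h.1.2 = c("0",
"0", "0", "6689807 (22.29%)", "0", "2864791 (9.66%)")), row.names = c(NA,
6L), class = "data.frame")

# Hypothetical helper: keep only the leading integer of "2038340 (9.24%)".
count_of <- function(x) as.numeric(sub("^([0-9]+).*$", "\\1", x))

# Numeric matrix of counts, one column per serotype (Sample is dropped).
counts <- sapply(df2[-1], count_of)

# Rows with a positive count in more than one column.
hits <- df2[rowSums(counts > 0) > 1, ]
hits$Sample
# "F722NARV"
```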