Exclude Rows Based On Multiple Strings In R
2
0
10.4 years ago
viniciushs88 ▴ 50

I would like to exclude entire rows from matrix "Y" based on two columns in matrix "X":

Matrix "X":

number1  number2  inf
gen1     genx1   223
gen1     genx2   221
gen2     genx3   224
gen2     genx5   225


Matrix "Y":

numberall  inf
gen1      223
genx1     256
gen2      225
genx2     214
gen3      563
genx3     235
gen4      256
genx4     568


Expected output matrix:

numberall  inf
gen3      563
gen4      256
genx4     568


Cheers.

r • 2.0k views
1
10.4 years ago

I believe something like this will work:

# collect every term you want to exclude (both identifier columns of x)
exclude = c(x[,1], x[,2])
# keep only the rows of y whose first column is not in that list
new.y = y[!(y[,1] %in% exclude),]
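
For anyone who wants to check this against the data in the question, here is a minimal reproducible sketch. It assumes both objects are character matrices (the originals may well have been data frames), and the `drop = FALSE` is my addition so the result stays a matrix even if only one row survives.

```r
# Rebuild the question's data as character matrices (an assumption:
# the original objects may have been data frames).
x <- matrix(c("gen1", "gen1", "gen2", "gen2",
              "genx1", "genx2", "genx3", "genx5",
              "223", "221", "224", "225"),
            ncol = 3,
            dimnames = list(NULL, c("number1", "number2", "inf")))
y <- matrix(c("gen1", "genx1", "gen2", "genx2",
              "gen3", "genx3", "gen4", "genx4",
              "223", "256", "225", "214",
              "563", "235", "256", "568"),
            ncol = 2,
            dimnames = list(NULL, c("numberall", "inf")))

# Collect every identifier that appears in either key column of x.
exclude <- c(x[, "number1"], x[, "number2"])

# Keep the rows of y whose identifier is not in that set.
new.y <- y[!(y[, "numberall"] %in% exclude), , drop = FALSE]
print(new.y)
```

The rows left over should be gen3, gen4 and genx4, which is the expected output from the question.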

3
%in% also works with a matrix, so you don't need exclude = c(x[,1],x[,2]):

> mat = matrix(c(letters[1:8]),nrow=4,ncol=2)
> letters[!letters %in% mat]
[1] "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
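
One caveat with matching against the whole matrix: every column is searched, including "inf". With the data in the question that is harmless, but if an identifier could ever look like an "inf" value it may be safer to restrict the match to the two key columns. A small sketch with a hypothetical matrix shaped like "X":

```r
# Hypothetical character matrix shaped like the question's "X".
x <- matrix(c("gen1", "gen2",
              "genx1", "genx3",
              "223", "224"),
            ncol = 3,
            dimnames = list(NULL, c("number1", "number2", "inf")))

# Matching against the whole matrix also searches the "inf" column.
in_whole <- "223" %in% x

# Restricting to the two identifier columns avoids that.
in_keys <- "223" %in% x[, c("number1", "number2")]

print(c(whole = in_whole, keys = in_keys))
```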

0
That's good to know!

1
10.4 years ago
EXCLUDE <- union(which(Y[,1] %in% X[,1]), which(Y[,1] %in% X[,2]))
Y[-EXCLUDE,]


Edit: Chris beat me to it with a similar method.

Edit 2: Apparently one can also simply write:

Y[-which(Y[,1] %in% X),]


which requires less typing.
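
One thing to watch with the `-which(...)` form: if nothing matches, `which()` returns an empty vector, and a negative empty index selects zero rows instead of all of them. Logical indexing with `!` does not have this problem. A small sketch with hypothetical data where no row of Y should be dropped:

```r
# Hypothetical data in which nothing in Y matches X.
X <- matrix(c("a", "b", "c", "d"), ncol = 2)
Y <- matrix(c("e", "f", "1", "2"), ncol = 2)

# which() finds no matches, so hits is integer(0).
hits <- which(Y[, 1] %in% X)

# A negative empty index is still empty, so zero rows are selected.
dropped <- Y[-hits, , drop = FALSE]

# Logical indexing keeps every row that does not match.
kept <- Y[!(Y[, 1] %in% X), , drop = FALSE]

print(nrow(dropped))
print(nrow(kept))
```

With the data in the question there are matches, so both forms give the same result there.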