R: remove rows by a list of rownames
9.8 years ago
silvi_free88 ▴ 50

Dear all, I am having a bit of trouble removing rows from a table by rowname. I would be very grateful if you could check whether I am doing something wrong. Here is what I'm doing:

> data = read.table("path/to/my/file/file", header=T, row.names=1, com='')
> head(data)
                    apo1_LO apo2_LO apo3_LO apo4_LO lux1_LO lux2_LO lux3_LO
comp230955_c0_seq10    0.00    8.05    8.15   32.91    0.00   38.41    0.00
comp135535_c0_seq1     1.00    0.00    0.00    0.00    0.00    0.00    0.00
comp222610_c5_seq4     4.96    8.21    1.84    5.57   14.59    6.54    6.51
comp227842_c8_seq3    93.53  131.08   79.54  198.85  166.32  108.80   43.79
comp230019_c3_seq1   141.97  355.04  142.11   71.02  497.84  424.01  212.69
comp215198_c1_seq3    18.27   28.28   10.52    0.00   64.35    0.00    0.00
                    lux4_LO WT1_LO WT2_LO WT3_LO WT4_LO
comp230955_c0_seq10   17.45  16.59   0.00   0.00   0.00
comp135535_c0_seq1     0.00   0.00   1.00   1.00   0.00
comp222610_c5_seq4     3.78   5.43  11.87   6.01   1.99
comp227842_c8_seq3   125.63  45.37  34.13 103.42  67.96
comp230019_c3_seq1   142.03  70.95  70.54  70.93 142.23
comp215198_c1_seq3    21.05  17.74   0.00  19.66   0.00

> dim(data)
[1] 171471     12

> vf = read.table("vf_IDs", header=T, sep="\t")
> remove= vf$contig_id # list of rownames I would like to remove from file "data"

> head(remove)
[1] comp168081_c0_seq1  comp168081_c1_seq1  comp168081_c2_seq1
[4] comp168081_c3_seq1  comp3015455_c0_seq1 comp13879_c0_seq1  
78 Levels: comp105265_c0_seq1 comp105265_c1_seq1 ... comp80665_c0_seq1

> datawithoutVF = data[ !(rownames(data) %in% remove), ]
> dim(datawithoutVF)
[1] 171417     12

# The dimensions of the new table, "datawithoutVF", look the same as those of the original "data", so it seems that no rows have been deleted. I then checked whether at least one of the rownames in the list was present in the original data.

> any(row.names(data) == 'comp168081_c2_seq1')
[1] TRUE

# But when I check the filtered table, I don't find it.

> any(row.names(datawithoutVF) == 'comp168081_c2_seq1')
[1] FALSE

It seems that it is removing the rownames but not the actual rows. I guess I am missing something important, but I do not understand what is going on.

How could I remove the entire row by its rowname?

Many thanks for your help!


datawithoutVF does have fewer rows than data in your example (54 fewer rows, to be exact).
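A quick way to confirm this kind of thing without comparing two six-digit numbers by eye is to compute the difference directly. The sketch below uses a small made-up data frame and made-up contig IDs (not the poster's data) and counts the dropped rows in two independent ways; it also lists IDs from the removal list that never occur in the table.

```r
# Toy stand-in for the expression table; the row names play the role of contig IDs.
data <- data.frame(
  apo1_LO = c(0.00, 1.00, 4.96, 93.53),
  WT1_LO  = c(16.59, 0.00, 5.43, 45.37),
  row.names = c("comp1_c0_seq1", "comp2_c0_seq1", "comp3_c0_seq1", "comp4_c0_seq1")
)

# IDs to drop; "comp9_c0_seq1" is deliberately absent from the table.
remove <- c("comp2_c0_seq1", "comp4_c0_seq1", "comp9_c0_seq1")

datawithoutVF <- data[!(rownames(data) %in% remove), ]

# Number of rows actually dropped, computed two ways; the values should agree.
n_dropped  <- nrow(data) - nrow(datawithoutVF)
n_matching <- sum(rownames(data) %in% remove)

# IDs in the removal list that were never in the table to begin with.
not_found <- setdiff(remove, rownames(data))
```

If `n_dropped` is smaller than `length(remove)`, `not_found` shows which IDs had no matching row.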


True!! Oh dear, I am dyslexic and did not see it! Thanks for pointing it out :)

No problem then.

9.8 years ago
silvi_free88 ▴ 50

SOLVED:

All the options work!

datawithoutVF = data[which(rownames(data) %nin% remove), ]
datawithoutVF = data[!row.names(data)%in%remove,]
datawithoutVF = data[ !(row.names(data) %in% remove), ]

Thanks to all!


Make sure to upvote the solutions that worked for you.

9.8 years ago
bmpbowen ▴ 40

Haven't tested it with your data, but try using the %nin% operator from the Hmisc package:

install.packages("Hmisc")
library("Hmisc")
datawithoutVF = data[which(rownames(data) %nin% remove), ]
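
If installing a package just for this feels heavy, a "not in" operator can be defined in one line of base R. The sketch below is only an illustration: the name `%notin%` and the toy data are assumptions, not anything from the thread. The removal list is made a factor on purpose, as `vf$contig_id` is in the question; `%in%` compares factors as character strings, so no conversion is needed.

```r
# Hypothetical local operator (the name %notin% is made up for this example).
`%notin%` <- function(x, table) !(x %in% table)

# Toy table and removal list; remove is a factor, as in the question.
data   <- data.frame(a = 1:4, b = 5:8, row.names = c("r1", "r2", "r3", "r4"))
remove <- factor(c("r2", "r4"))

kept_a <- data[rownames(data) %notin% remove, ]   # using the local operator
kept_b <- data[!(rownames(data) %in% remove), ]   # plain base R negation
```

Both forms index with a logical vector, so wrapping the condition in `which()` is optional here.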

Hi, thanks for the reply. I tried it and ran into the same problem:

> datawithoutVF = data[which(rownames(data) %nin% remove), ]
> dim(datawithoutVF)
[1] 171417     12
> any(row.names(datawithoutVF) == 'comp168081_c2_seq1')
[1] FALSE

However, I also tried the negated version, like this:

> datawithoutVF = data[-which(rownames(data) %nin% remove), ]
> dim(datawithoutVF)
[1] 54 12

> any(row.names(datawithoutVF) == 'comp168081_c2_seq1')
[1] TRUE

In this case it seems to remove rows, but it keeps only the ones that are in the list.
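
That result is what a double negative would produce: `which(... %nin% remove)` gives the positions of rows that are not in the list, and a leading minus then drops exactly those positions. A minimal sketch with made-up row names (using `!(... %in% ...)` in place of `%nin%` so it runs without Hmisc):

```r
data   <- data.frame(a = 1:5, b = 6:10, row.names = paste0("r", 1:5))
remove <- c("r2", "r5")

# Positions of the rows that are NOT in the removal list.
idx_keep <- which(!(rownames(data) %in% remove))

filtered    <- data[idx_keep, ]    # keeps the rows not in the list
only_listed <- data[-idx_keep, ]   # drops them, leaving only the listed rows
```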


What about

datawithoutVF = data[-which(rownames(data) %in% remove), ]
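
One caveat with the `-which(...)` form, sketched here on made-up data: if none of the IDs match, `which()` returns an empty integer vector, and negative indexing with an empty vector selects zero rows rather than all of them. The logical form does not have this edge case.

```r
data   <- data.frame(a = 1:3, b = 4:6, row.names = c("r1", "r2", "r3"))
remove <- c("x1", "x2")   # none of these IDs occur in the table

# -integer(0) is still integer(0), so this selects no rows at all.
via_which   <- data[-which(rownames(data) %in% remove), ]

# The logical index is all TRUE, so every row is kept, as intended.
via_logical <- data[!(rownames(data) %in% remove), ]
```

For that reason the `!(... %in% ...)` form is usually the safer default when the removal list might not overlap the table.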
9.8 years ago
edwards • 0

I think you need to put in row.names() instead of rownames(). Also the parentheses may be throwing it off. Try this instead.

datawithoutVF = data[!row.names(data)%in%remove,]

Thanks for the reply! Not working though:

> datawithoutVF = data[!row.names(data)%in%remove,]
> dim(datawithoutVF)
[1] 171417     12

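rownames() and row.names() will typically return the same thing, so swapping one for the other should not change the result here. row.names() is an S3 generic, though, which makes it the more extensible of the two.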
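
For a data frame this is easy to check directly. A small sketch on a made-up data frame:

```r
df <- data.frame(x = 1:2, y = 3:4, row.names = c("a", "b"))

# Both accessors return the same character vector for a data frame.
same_names <- identical(rownames(df), row.names(df))

# So the two filters select exactly the same rows.
remove   <- "b"
same_sub <- identical(df[!(rownames(df) %in% remove), ],
                      df[!(row.names(df) %in% remove), ])
```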
