Finding which gene is missing in a column.
3.4 years ago
3kixintehead ▴ 10

I'm working on a DESeq2 pipeline and trying to plot some significant genes, selected by padj value. I'm using biomaRt to convert Ensembl IDs to HGNC symbols. However, the conversion returns fewer rows than I started with, so I can't match the symbols back to my column of Ensembl IDs. I think this is because at least one Ensembl ID isn't in the biomaRt results for some reason. I plan to fill in the missing entry manually, but first I have to find which one is missing.

I know I must be missing something basic about how R works, but I'm still relatively new to R and I'm not sure what exactly is going on. Following the formula from https://stackoverflow.com/questions/13774773/check-whether-values-in-one-data-frame-column-exist-in-a-second-data-frame gets me "NULL":

A$C[!A$C %in% B$C]
[1] 2   # returns all values of A$C that are NOT in B$C
v---my code---v
filter_df_padj$rownames[!filter_df_padj$rownames %in% gns_padj$ensemble_gene_id]

I also tried assigning each column to a vector and then comparing them, as in this guide: https://www.r-bloggers.com/2017/03/match-function-in-r/, and still ended up getting "NULL".

v1 <- filter_df_padj$rownames
v2 <- gns_padj$ensemble_gene_id

Thanks for any help.
RNA-Seq R • 573 views

Try setdiff(). Make sure both columns are character vectors and not factors. Look up these terms to understand more.
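For illustration, here is a minimal sketch of that suggestion on toy data. The data frame and column names mirror the ones in the question, but the Ensembl IDs and symbols are made up, and the second column is assumed to be named ensembl_gene_id, as biomaRt usually returns it.

```r
# Toy stand-ins for the two tables; the IDs and symbols are invented.
filter_df_padj <- data.frame(
  rownames = c("ENSG00000000001", "ENSG00000000002", "ENSG00000000003"),
  padj = c(0.001, 0.01, 0.04),
  stringsAsFactors = FALSE
)
gns_padj <- data.frame(
  ensembl_gene_id = c("ENSG00000000001", "ENSG00000000003"),
  hgnc_symbol = c("GENE1", "GENE3"),
  stringsAsFactors = FALSE
)

# as.character() guards against either column being stored as a factor.
ids_all <- as.character(filter_df_padj$rownames)
ids_mapped <- as.character(gns_padj$ensembl_gene_id)

# IDs present in the first vector but absent from the second.
missing_ids <- setdiff(ids_all, ids_mapped)
print(missing_ids)
# [1] "ENSG00000000002"
```

If the IDs are actually stored as the row names of the results table rather than in a column, rownames(filter_df_padj) would be the vector to pass in instead.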


Are you sure that you mean $ensemble_gene_id, and not $ensembl_gene_id?
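To show why a misspelled column name would produce "NULL" rather than an error, here is a small sketch on made-up data (the IDs and symbols are invented; only the column names follow the question). It also checks the row-name case, on the assumption that the IDs might live in the row names rather than in a column called rownames.

```r
# Toy stand-in for a biomaRt result; the IDs and symbols are invented.
gns_padj <- data.frame(
  ensembl_gene_id = c("ENSG00000000001", "ENSG00000000003"),
  hgnc_symbol = c("GENE1", "GENE3"),
  stringsAsFactors = FALSE
)

# A misspelled column name does not raise an error; `$` silently returns NULL.
wrong_col <- gns_padj$ensemble_gene_id
right_col <- gns_padj$ensembl_gene_id
print(is.null(wrong_col))  # TRUE
print(is.null(right_col))  # FALSE

# Checking the available names first makes the typo obvious.
print(colnames(gns_padj))
print("ensemble_gene_id" %in% colnames(gns_padj))  # FALSE

# Row names are not a column either: `$rownames` is NULL unless such a
# column was created explicitly; rownames() retrieves them instead.
print(is.null(gns_padj$rownames))  # TRUE
print(rownames(gns_padj))

# Subsetting NULL gives NULL, which matches the output reported above.
print(wrong_col[!wrong_col %in% right_col])  # NULL
```

Running colnames() on both data frames before comparing them is a quick way to confirm which names actually exist.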
