Mapping common identifiers between two files

6.0 years ago

mohammedtoufiq91 ▴ 270

Hi,

I have two *.csv files with different numbers of rows and columns. Using the merge function, I was able to combine the two files by mapping on the ProbeID column (common to both files) and save all the data in an output file. However, I notice that even the unmapped rows are being saved in the output file. I am only interested in the mapped IDs common to both files. Please assist me with this.

File_1 has 33298 ProbeIDs
File_2 has 41270 ProbeIDs
Combined file has 41270 ProbeIDs

Combined <- merge(File_1, File_2, by = "ProbeID")

Thank you,

Toufiq

merge mapping csv R • 1.6k views

Have a look at the dplyr joins. dplyr::left_join(df1, df2) keeps all the rows from df1. dplyr::right_join(df1, df2) keeps all the rows from df2.
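
For the "common IDs only" case in the question, the dplyr join that likely fits is inner_join, which keeps only rows whose key occurs in both data frames. Below is a minimal sketch, assuming dplyr is installed; the toy data frames df1 and df2 and their columns expr and gene are made up for illustration and are not the poster's files.

```r
library(dplyr)

# Hypothetical toy data standing in for File_1 and File_2
df1 <- data.frame(ProbeID = c("p1", "p2", "p3"),
                  expr = c(5.1, 6.2, 7.3),
                  stringsAsFactors = FALSE)
df2 <- data.frame(ProbeID = c("p2", "p3", "p4"),
                  gene = c("A", "B", "C"),
                  stringsAsFactors = FALSE)

# inner_join keeps only rows whose ProbeID occurs in both data frames
common <- inner_join(df1, df2, by = "ProbeID")

# left_join keeps every row of df1; unmatched rows get NA in df2's columns
all_df1 <- left_join(df1, df2, by = "ProbeID")
```

Here common should hold only the "p2" and "p3" rows, while all_df1 also keeps "p1" with NA in the gene column.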

6.0 years ago by Chirag Parsania ★ 2.0k

Please provide a reproducible example of the input and the expected output. Your code looks fine and should return only the rows whose "ProbeID" values occur in both files. Test this example:

merge(data.frame(x = 1:3, y = 11),
      data.frame(x = 2:4, z = 22), by = "x")
#   x  y  z
# 1 2 11 22
# 2 3 11 22

See this StackOverflow post for more examples and other merging options:

How to join (merge) data frames (inner, outer, left, right)
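
If the merged row count still looks too high, one possible cause (an assumption, since the question does not show the data) is repeated ProbeIDs in one of the files: an inner merge returns one row per matching pair, so duplicates inflate the result even though every row is a genuine match. A rough base R sketch of the checks, using made-up toy data in place of the real File_1 and File_2:

```r
# Hypothetical toy data: File_2 repeats ProbeID "p2"
File_1 <- data.frame(ProbeID = c("p1", "p2", "p3"),
                     expr = c(5.1, 6.2, 7.3),
                     stringsAsFactors = FALSE)
File_2 <- data.frame(ProbeID = c("p2", "p2", "p3", "p4"),
                     gene = c("A", "A2", "B", "C"),
                     stringsAsFactors = FALSE)

# Number of distinct ProbeIDs present in both files
common_ids <- intersect(File_1$ProbeID, File_2$ProbeID)
n_common <- length(common_ids)

# Number of repeated ProbeID entries in each file
n_dup_1 <- sum(duplicated(File_1$ProbeID))
n_dup_2 <- sum(duplicated(File_2$ProbeID))

# merge() defaults to all = FALSE, i.e. an inner join;
# each duplicated key contributes one extra row per match
Combined <- merge(File_1, File_2, by = "ProbeID")
n_rows <- nrow(Combined)
```

Comparing n_common with n_rows shows whether duplicates explain the difference. If n_rows exceeds the number of common IDs, every row in Combined is still a matched ProbeID, just repeated.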

6.0 years ago by zx8754 12k