Merge two data sets

6.9 years ago

saj98 ▴ 140

Hello everyone

I have two RNA-seq experiments, one done on tissue and the other on cells, and I want to correlate gene expression between them. I have a CSV file for each experiment and would like to merge the two into a single CSV file. How can I merge the two files by matching gene IDs? I need this step because I want to calculate the correlation of gene expression between in vitro and in vivo.

Thanks for your help

RNA-Seq R • 2.6k views

ADD COMMENT • link 6.9 years ago by saj98 ▴ 140

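You can read the CSV files into R separately and use the gene IDs as the row names of each data frame; then you can use cbind to combine the two data frames. Note that cbind pairs rows by position, not by name, so both data frames must first be restricted to the same genes in the same order.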

ADD REPLY • link 6.9 years ago by zjhzwang ▴ 180
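
A minimal sketch of the row-name approach, using small made-up data frames in place of the two CSV files (the gene IDs and column names are hypothetical). Because cbind pairs rows by position, the sketch first restricts both tables to their shared gene IDs, in one fixed order:

```r
# Toy stand-ins for the two experiments; in practice these would come from
# read.csv(..., row.names = 1) so that the gene IDs become the row names.
tissue <- data.frame(tissue_expr = c(10, 20, 30),
                     row.names = c("geneA", "geneB", "geneC"))
cells <- data.frame(cells_expr = c(300, 100, 400),
                    row.names = c("geneC", "geneA", "geneD"))

# Keep only genes present in both tables, in one fixed order.
shared <- intersect(rownames(tissue), rownames(cells))

# Index both data frames by the shared IDs so the rows line up, then bind.
combined <- cbind(tissue[shared, , drop = FALSE],
                  cells[shared, , drop = FALSE])
print(combined)
```

Genes that appear in only one experiment (geneB and geneD here) are dropped by the intersect step.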

R also has a merge function that is useful, and you can merge by gene ID.

ADD REPLY • link 6.9 years ago by theobroma22 ★ 1.2k
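
A minimal sketch of merging by gene ID and then correlating the two experiments. The data frames stand in for the two CSV files, and the column names gene_id, tissue_expr and cells_expr are assumptions, not names from the original files:

```r
# Toy stand-ins for the two CSV files; each has a gene ID column.
tissue <- data.frame(gene_id = c("geneA", "geneB", "geneC", "geneD"),
                     tissue_expr = c(1, 2, 3, 4))
cells <- data.frame(gene_id = c("geneD", "geneC", "geneB", "geneE"),
                    cells_expr = c(8, 6, 4, 10))

# Inner join on the shared key column; genes found in only one table are dropped.
merged <- merge(tissue, cells, by = "gene_id")

# Write the merged table out as a single CSV file (a temporary file here).
out_file <- tempfile(fileext = ".csv")
write.csv(merged, out_file, row.names = FALSE)

# Correlate expression across the matched genes (Pearson by default).
r <- cor(merged$tissue_expr, merged$cells_expr)
print(merged)
print(r)
```

cor also accepts method = "spearman" if a rank-based correlation is preferred.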

Hello, I used the merge command, but I got this error message. Any help or suggestions?

> CP1.df <- read.csv(file.choose(), header = TRUE, sep = ",")
> CP2.df <- read.csv(file.choose(), header = TRUE, sep = ",")
> X=merge(CP1.df,CP2.df)
Error: cannot allocate vector of size 4.9 Gb
In addition: Warning messages:
1: In expand.grid(seq_len(nx), seq_len(ny)) :
  Reached total allocation of 20388Mb: see help(memory.size)
2: In expand.grid(seq_len(nx), seq_len(ny)) :
  Reached total allocation of 20388Mb: see help(memory.size)
3: In expand.grid(seq_len(nx), seq_len(ny)) :
  Reached total allocation of 20388Mb: see help(memory.size)
4: In expand.grid(seq_len(nx), seq_len(ny)) :
  Reached total allocation of 20388Mb: see help(memory.size)
>

ADD REPLY • link updated 6.9 years ago by WouterDeCoster 47k • written 6.9 years ago by saj98 ▴ 140

I added markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in the toolbar when you compose or edit a post.

ADD REPLY • link 6.9 years ago by WouterDeCoster 47k

Did you try passing a by argument to merge?

ADD REPLY • link 6.9 years ago by russhh 5.7k

Check the join command in bash.

ADD REPLY • link 6.9 years ago by gangireddy ▴ 160

Hello, I figured out how to solve it, and I am sharing it with you:

exporttab <- merge(x=dwd_nogap, y=dwd_gap, by.x='x1', by.y='x2', fill=-9999)

It was very useful.

ADD REPLY • link 6.9 years ago by saj98 ▴ 140
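
A small sketch of the by.x / by.y pattern used above, on made-up data frames (x, y and their columns are hypothetical). One caveat, assuming plain data frames: the documented arguments of merge for data frames do not include fill, so the fill=-9999 in the call above is most likely ignored. If a sentinel value is wanted for unmatched genes, it can be set explicitly after an outer join:

```r
# Toy tables whose key columns have different names, as in the call above.
x <- data.frame(x1 = c("geneA", "geneB", "geneC"), tissue_expr = c(1, 2, 3))
y <- data.frame(x2 = c("geneB", "geneC", "geneD"), cells_expr = c(20, 30, 40))

# Outer join: by.x / by.y name the key column in each table; all = TRUE keeps
# unmatched rows from both sides and fills their missing values with NA.
full <- merge(x, y, by.x = "x1", by.y = "x2", all = TRUE)

# Replace the NAs with a sentinel value explicitly, if one is wanted.
full[is.na(full)] <- -9999
print(full)
```

For a correlation it is usually safer to leave the NAs in place (or use the default inner join), since a sentinel such as -9999 would be treated as a real expression value.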

Do you see why it initially failed, expanding to > 20Gb, though?

ADD REPLY • link 6.9 years ago by russhh 5.7k
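
A likely explanation, sketched on toy data: when no by argument is given, merge joins on the column names the two data frames have in common. If they share none, the result is the Cartesian product of the two tables, and the expand.grid(seq_len(nx), seq_len(ny)) call in the warnings above suggests that this is what happened. The column names below are hypothetical:

```r
# Two tables with the same genes but differently named ID columns.
a <- data.frame(GeneID = c("geneA", "geneB", "geneC"), tissue_expr = c(1, 2, 3))
b <- data.frame(gene_id = c("geneA", "geneB", "geneC"), cells_expr = c(10, 20, 30))

# No column name is shared, so merge() has nothing to join on and pairs
# every row of a with every row of b.
no_key <- merge(a, b)
print(nrow(no_key))  # nrow(a) * nrow(b) rows

# Naming the key columns restores a one-row-per-gene join.
keyed <- merge(a, b, by.x = "GeneID", by.y = "gene_id")
print(nrow(keyed))
```

At that rate, two tables of 20,000 genes each would expand to 400 million rows, which is consistent with running out of memory.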