Question: Merging two files based on the identifier column (gene symbols)
13 days ago by
mohammedtoufiq9150 wrote:

Hi,

I have two *.csv files that share one column, the gene symbols, but otherwise have different column headers. One file contains the gene symbols and expression data (samples); the other contains the gene symbols and phenotypic data/attributes. I would like to merge the two files by matching on the gene symbol column and save all the data in one file for further analysis. How can this be done?

Thank you,

Toufiq


Have you read the help page of the merge function?

?merge
written 13 days ago by Benn

Thank you so much, @Benn.

written 13 days ago by mohammedtoufiq91

Cross-posted: https://support.bioconductor.org/p/124514

written 13 days ago by ATpoint
13 days ago by
EMBL Heidelberg, Germany
Jean-Karim Heriche20k wrote:

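This can be done in the terminal with the join utility. Sort both files on the gene symbol column first, then run, e.g., join -t, -a1 -a2 file1.csv file2.csv (the -t, option sets the comma as the field separator, which CSV files need; by default join matches on the first field of each file).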

The -a option is used to keep unpairable lines from the corresponding file, i.e. in case a gene symbol is in one file but not the other.
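
As a rough sketch of the join approach, the script below builds two small, made-up CSV files and merges them on the first column. The file names (expression.csv, phenotype.csv, merged.csv), the column names, and the values are all hypothetical. It assumes simple CSV with no quoted or embedded commas, the gene symbol in the first column, and one header line per file. The header is set aside before sorting, and -e NA with an explicit -o field list fills the gaps for genes that appear in only one file, so every output row has the same number of columns.

```shell
set -eu

# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical expression data: gene symbol plus two samples.
cat > expression.csv <<'EOF'
gene,sample1,sample2
TP53,5.1,6.3
BRCA1,2.4,2.2
EGFR,7.8,8.0
EOF

# Hypothetical phenotype/attribute data: gene symbol plus one attribute.
cat > phenotype.csv <<'EOF'
gene,pathway
EGFR,signaling
MYC,proliferation
TP53,apoptosis
EOF

# Drop the header line, then sort the data rows on the gene column.
# LC_ALL=C keeps the sort order consistent with what join expects.
tail -n +2 expression.csv | LC_ALL=C sort -t, -k1,1 > expression.sorted
tail -n +2 phenotype.csv  | LC_ALL=C sort -t, -k1,1 > phenotype.sorted

{
  # Combined header: all columns of file 1, then file 2 without its gene column.
  printf '%s,%s\n' "$(head -n 1 expression.csv)" \
                   "$(head -n 1 phenotype.csv | cut -d, -f2-)"
  # -a1 -a2 keep unpaired rows from both files (a full outer join);
  # -o lists the output fields (0 is the join field), -e fills missing ones.
  LC_ALL=C join -t, -a1 -a2 -e NA -o 0,1.2,1.3,2.2 \
    expression.sorted phenotype.sorted
} > merged.csv

cat merged.csv
```

The -o list has to be adapted to the real number of columns in each file. For files with quoted fields or many columns, the merge function in R mentioned above is likely the more comfortable route.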
