Question: How to merge column text files using terminal command?
24 months ago by
Genosa100
Genosa100 wrote:

Sorry if this question is really basic, but I'm at a loss trying to find a simple way to join / merge the data of 2 cuffdiff output files.

In one of the tracking output files (let's call this file 1), the columns listed are:

tracking_id condition replicate raw_frags internal_scaled_frags external_scaled_frags FPKM effective_length status

In the other file containing differential gene expression analysis, the columns listed are: (let's call this file 2)

test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant

I would like to analyse the data from file 1; however, its columns only give the gene ID (in ENSG00000xxx format) without the gene name. I would like a way to merge the test_id (identical to tracking_id in file 1) and gene_id from file 2 into file 1.

What is the terminal command to do this please? I also need to retain the information and format of the remaining columns of file 1 for further analysis.

I tried the command 'join'. However, in my joined file all the columns, from tracking_id to gene_id and the rest, are merged into a SINGLE column. Is there a way I can stop the terminal from doing this, please?

linux terminal rnaseq • 1.5k views

I have never worked with cuffdiff output files, but maybe you can use paste?

written 24 months ago by zjhzwang180
24 months ago by
b.nota5.4k
Netherlands
b.nota5.4k wrote:

The best way is to use the R console for this. Use the function merge.

Check the manual page with:

?merge
24 months ago by
michael.ante2.9k
Austria/Vienna
michael.ante2.9k wrote:

Hi Genosa,

You can specify in the join command on which columns you want to join:

join -1 1 -2 2 file1 file2

If you only need selected columns in your output, you can control this by -o (here joining-column, second column of first file and 5th column of second file):

join -1 1 -2 2 -o 0,1.2,2.5 file1 file2
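
As a rough illustration of the two commands above, here is a minimal sketch on made-up data. The file contents, the gene names and the `workdir` variable are assumptions for the example, not real cuffdiff output. Note that join expects both inputs to be sorted on the join field, so the sketch sorts them first.

```shell
export LC_ALL=C   # make sort and join agree on collation order
workdir=$(mktemp -d)

# Hypothetical file1 columns: tracking_id condition replicate raw_frags
printf '%s\n' 'ENSG02 q2 0 3.2' 'ENSG01 q1 0 10.5' > "$workdir/file1"
# Hypothetical file2 columns: test_id gene_id gene locus sample_1
printf '%s\n' 'T2 ENSG02 GENEB chr1:1-100 treated' \
              'T1 ENSG01 GENEA chr2:5-90 ctrl' > "$workdir/file2"

# join needs each input sorted on its own join field
sort -k1,1 "$workdir/file1" > "$workdir/file1.sorted"
sort -k2,2 "$workdir/file2" > "$workdir/file2.sorted"

# Join column 1 of file1 with column 2 of file2; print the join field,
# column 2 of file1 and column 5 of file2
joined=$(join -1 1 -2 2 -o 0,1.2,2.5 "$workdir/file1.sorted" "$workdir/file2.sorted")
printf '%s\n' "$joined"
```

With these toy inputs, each output line holds the gene ID, the condition from file1 and sample_1 from file2, separated by single spaces.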

Stackoverflow has a couple of nice examples like this one.

Cheers,

Michael


Hi Michael, thanks! But the joined data all appear in 1 column rather than in separate columns. Is there a way I can separate them? Thank you!

modified 24 months ago • written 24 months ago by Genosa100

AFAIK, the standard delimiter of join is a blank space rather than a tab. With -t, you can change it.
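
A minimal sketch of the -t suggestion, assuming tab-separated inputs without header lines (real cuffdiff files have a header row that would need separate handling). The file names, contents and the `tab` and `workdir` variables are made up for the example.

```shell
export LC_ALL=C            # make sort and join agree on collation order
tab=$(printf '\t')         # a literal tab character, portable across shells
workdir=$(mktemp -d)

# Hypothetical file1 columns: tracking_id, condition, FPKM
printf 'ENSG02\tq2\t3.2\nENSG01\tq1\t10.5\n' \
  | sort -t "$tab" -k1,1 > "$workdir/file1.tsv"
# Hypothetical file2 columns: test_id, gene_id, gene
printf 'ENSG02\tENSG02\tGENEB\nENSG01\tENSG01\tGENEA\n' \
  | sort -t "$tab" -k1,1 > "$workdir/file2.tsv"

# -t sets the field separator for both input and output, so the result
# stays tab-separated; -o keeps all file1 columns and appends the gene name
merged=$(join -t "$tab" -1 1 -2 1 -o 1.1,1.2,1.3,2.3 \
  "$workdir/file1.tsv" "$workdir/file2.tsv")
printf '%s\n' "$merged"
```

Because the output delimiter is now a tab, a spreadsheet or R should read the result as four separate columns rather than one.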

written 24 months ago by michael.ante2.9k