Merge multiple files on multiple columns?
5.9 years ago
star ▴ 350

I have multiple files in which the first three columns are the same, and I would like to merge them based on these three columns. I can do this with the merge command in R, but I would like to do it in Linux (I tried the join command, but it does not work well).

data1:

 chr1   724060  724400  SK       chr1  724206   725561  peak_1  24  .  2   194
 chr1   729399  731900  sun         .     -1       -1      .    .   .   .   0

data2:

 chr1   724060  724400  sk            .   -1       -1      .    .   .   .   0
 chr1   729399  731900  sun         chr1 724206 725561  peak_10 24  .  5   104

output:

 chr1   724060  724400  SK       chr1  724206   725561  peak_1  24  .  2   194                .   -1       -1      .    .   .   .   0
 chr1   729399  731900  sun         .     -1       -1      .    .   .   .   0   chr1 724206 725561  peak_10 24  .  5   104

Please include the exact join command you tried.

I just ran: join data1.bed data2.bed > data.bed

For join to work, the files must be sorted (in the same order), and you have to tell join which field you want it to do the joining by.
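
To make that concrete: join matches on a single field, so one way to join on three columns is to build a composite key first. The following is only a sketch under some assumptions: the inputs are tab-separated, the file names data1.keyed, data2.keyed and merged.bed are made up for illustration, and the sample rows are the ones from the question with whitespace converted to tabs.

```shell
# Sample inputs from the question, with spaces converted to tabs.
printf '%s\n' \
  'chr1 724060 724400 SK chr1 724206 725561 peak_1 24 . 2 194' \
  'chr1 729399 731900 sun . -1 -1 . . . . 0' | tr ' ' '\t' > data1.bed
printf '%s\n' \
  'chr1 724060 724400 sk . -1 -1 . . . . 0' \
  'chr1 729399 731900 sun chr1 724206 725561 peak_10 24 . 5 104' | tr ' ' '\t' > data2.bed

TAB=$(printf '\t')

# join matches on ONE field, so prefix each line with a composite key
# (chr:start:end) and sort both files on that key in the same locale.
awk 'BEGIN{FS=OFS="\t"} {print $1":"$2":"$3, $0}' data1.bed |
  LC_ALL=C sort -t "$TAB" -k1,1 > data1.keyed

# For the second file keep only the key plus columns 4 onward,
# so the shared coordinates are not repeated in the output.
awk 'BEGIN{FS=OFS="\t"} {r=$4; for(i=5;i<=NF;i++) r=r OFS $i; print $1":"$2":"$3, r}' data2.bed |
  LC_ALL=C sort -t "$TAB" -k1,1 > data2.keyed

# Join on the key (field 1 of both files), then drop the key column.
LC_ALL=C join -t "$TAB" -1 1 -2 1 data1.keyed data2.keyed | cut -f 2- > merged.bed
```

Rows whose key appears in only one file are dropped by default; join's -a option can keep them if that is what you need.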

5.9 years ago
Joe 21k

I'm not sure I fully understand the question, but I'm not thinking mega clearly. Is this the desired outcome?

I'm confused by your desired output because I don't see the string "RA" anywhere in the input files.

1. First I had to fix the whitespace in your example data so that it was properly tab-delimited:

perl -p -e 's/ +/\t/g' file1.txt > file1.tsv 
# and the same for file2

2. If the rows of both files are already in the same order top-to-bottom (you haven't said), then I think this will work:

paste file1.tsv <(cat file2.tsv | cut -d$'\t' -f 4-)

which yields:

$ paste file1.tsv <(cat file2.tsv | cut -d$'\t' -f 4-)
chr1    724060  724400  SK  chr1    724206  725561  peak_1  24  .   2   194 sk  .   -1  -1  .   .   .   .   0
chr1    729399  731900  sun .   -1  -1  .   .   .   .   0   sun chr1    724206  725561  peak_10 24  .   5   104
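
If the rows are not guaranteed to be in the same order, a key-based lookup in awk is one alternative to paste. This is a sketch rather than a tested pipeline for real data: it assumes tab-separated input, reuses the file1.tsv and file2.tsv names from above, writes to a made-up merged.tsv, and deliberately lists the second file's rows in reverse order to show that order does not matter. It also starts from field 5 of the second file, since the desired output in the question appears to omit that file's name column (the paste command above keeps it).

```shell
# Sample inputs from the question, with spaces converted to tabs.
# file2.tsv is written in reverse row order on purpose.
printf '%s\n' \
  'chr1 724060 724400 SK chr1 724206 725561 peak_1 24 . 2 194' \
  'chr1 729399 731900 sun . -1 -1 . . . . 0' | tr ' ' '\t' > file1.tsv
printf '%s\n' \
  'chr1 729399 731900 sun chr1 724206 725561 peak_10 24 . 5 104' \
  'chr1 724060 724400 sk . -1 -1 . . . . 0' | tr ' ' '\t' > file2.tsv

# First pass (NR==FNR, i.e. file2.tsv): store columns 5 onward,
# keyed by columns 1-3.
# Second pass (file1.tsv): print each line followed by the stored
# columns when the same key was seen in file2.tsv.
awk 'BEGIN{FS=OFS="\t"}
     NR==FNR { r=$5; for(i=6;i<=NF;i++) r=r OFS $i; rest[$1,$2,$3]=r; next }
     ($1,$2,$3) in rest { print $0, rest[$1,$2,$3] }' file2.tsv file1.tsv > merged.tsv
```

The output keeps the row order of file1.tsv, and rows of file1.tsv with no matching key in file2.tsv are skipped.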