3.9 years ago
vborincenturion
I have a file 1 that contains only IDs:
BAC0713
BAC0713
BAC0713
BAC0755
BAC0755
BAC0353
BAC0353
.....
And I have a file 2 with more information:
BAC0713 copM A0A0F6QDN6 Synechocystis sp. PCC 6803 Plasmid Copper (Cu)
BAC0755 galE A0A1Z2RUL8 Uncultured bacterium Chromosome Cetyltrimethylammonium bromide (CTAB) [class: Quaternary Ammonium Compounds (QACs)]
BAC0353 smdA A7VN01 Serratia marcescens Chromosome 4,6-diamidino-2-phenylindole (DAPI) [class: Diamidine], Hoechst 33342 [class: Bisbenzimide]
....
I want to add the information from file 2 to every ID in file 1. How can I do that?
The example output is:
BAC0713 copM A0A0F6QDN6 Synechocystis sp. PCC 6803 Plasmid Copper (Cu)
BAC0713 copM A0A0F6QDN6 Synechocystis sp. PCC 6803 Plasmid Copper (Cu)
BAC0713 copM A0A0F6QDN6 Synechocystis sp. PCC 6803 Plasmid Copper (Cu)
BAC0755 galE A0A1Z2RUL8 Uncultured bacterium Chromosome Cetyltrimethylammonium bromide (CTAB) [class: Quaternary Ammonium Compounds (QACs)]
BAC0755 galE A0A1Z2RUL8 Uncultured bacterium Chromosome Cetyltrimethylammonium bromide (CTAB) [class: Quaternary Ammonium Compounds (QACs)]
BAC0755 galE A0A1Z2RUL8 Uncultured bacterium Chromosome Cetyltrimethylammonium bromide (CTAB) [class: Quaternary Ammonium Compounds (QACs)]
BAC0353 smdA A7VN01 Serratia marcescens Chromosome 4,6-diamidino-2-phenylindole (DAPI) [class: Diamidine], Hoechst 33342 [class: Bisbenzimide]
BAC0353 smdA A7VN01 Serratia marcescens Chromosome 4,6-diamidino-2-phenylindole (DAPI) [class: Diamidine], Hoechst 33342 [class: Bisbenzimide]
Assuming that file 1 has repeated IDs and file 2 does not.
Yes, file1 has repeated IDs, but file2 doesn't. The command join -1 1 -2 1 <(sort -k1 file2) <(uniq file1|sort -k 1) doesn't do anything.
I tried it myself and it worked fine for me; the example output I showed is what I get. How did you run it, and what is the delimiter?
file2 had some duplicate IDs; I removed them and it worked. Thank you.
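For anyone who hits the same problem, here is a minimal sketch of an alternative to join that may be worth trying. It is not the command used in this thread: it swaps in an awk lookup, which needs no sorting and keeps the repeated IDs of file1 in their original order. It assumes the ID is the first whitespace-separated field of every line. The sample data is shortened and written with tabs purely for illustration, and the names workdir, dup_ids and annotated are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sample data standing in for file1 and file2
# (shortened; tab-delimited here only as an assumption).
workdir=$(mktemp -d)
printf 'BAC0713\nBAC0713\nBAC0755\nBAC0353\n' > "$workdir/file1"
printf 'BAC0713\tcopM\tA0A0F6QDN6\nBAC0755\tgalE\tA0A1Z2RUL8\nBAC0353\tsmdA\tA7VN01\n' > "$workdir/file2"

# 1. Check file2 for duplicate IDs first.
#    uniq -d prints only values that occur more than once, so an
#    empty result means every ID in file2 is unique.
dup_ids=$(awk '{ print $1 }' "$workdir/file2" | sort | uniq -d)

# 2. Read file2 into a lookup table keyed on the ID (first field),
#    then print the stored file2 line once for every matching ID in file1.
#    IDs missing from file2 are skipped silently.
annotated=$(awk 'NR == FNR { info[$1] = $0; next }
                 $1 in info { print info[$1] }' "$workdir/file2" "$workdir/file1")

printf '%s\n' "$annotated"
rm -rf "$workdir"
```

On real data the second step would be awk 'NR == FNR { info[$1] = $0; next } $1 in info { print info[$1] }' file2 file1. If file2 still contains duplicate IDs, this sketch keeps only the last line seen for each ID, so running the duplicate check first is a sensible precaution.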