Hi, I have two files. The first (file1.txt) contains SNP coordinates and rs IDs, and was obtained from an hg19 file.
1:10150:C:T rs371194064
1:10165:A:AC rs796884232
1:10177:A:AC rs367896724
1:10177:A:C rs201752861
1:10180:T:C rs201694901
1:10199:A:T rs905327004
1:10228:TA:T rs143255646
The other file (file2.txt) has coordinates only and looks like this:
1:838916:A:T
1:839461:T:C
1:839495:G:T
1:839528:A:G
1:839529:T:G
1:840353:G:C
I want to merge both files based on identical coordinates between them. The first file is a large file, whereas the second file has a few million SNPs. I want to find the SNP rsIDs for the second file by merging it with the first.
I used the command below, but it produces a file containing everything, including the contents of file2.txt, whereas I only want the rows that match between the two files.
paste -d " " file1.txt file2.txt > Final_rsIDs.txt
join needs both files to be sorted on the join field (here, the coordinate column). If it throws an error about the input not being in sorted order, sort both files first and then join.
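A minimal sketch of the sort-then-join approach, assuming both files are space-delimited with the coordinate in the first column, as in the samples above. The function name match_rsids and the temporary file names are hypothetical, chosen only for illustration. LC_ALL=C is set so that sort and join agree on the same byte-wise ordering.

```shell
# Hypothetical helper: print the rows of the annotation file ($1) whose
# coordinate (first column) also appears in the coordinate-only file ($2).
match_rsids() {
    annot=$1
    query=$2
    tmpdir=$(mktemp -d) || return 1

    # join expects both inputs sorted on the join field, in the same collation order
    LC_ALL=C sort -k1,1 "$annot" > "$tmpdir/annot.sorted"
    LC_ALL=C sort -k1,1 "$query" > "$tmpdir/query.sorted"

    # By default join matches on field 1 and prints only keys found in both inputs,
    # followed by the remaining fields (here, the rs ID from the annotation file)
    LC_ALL=C join "$tmpdir/annot.sorted" "$tmpdir/query.sorted"

    rm -rf "$tmpdir"
}
```

It could then be called as match_rsids file1.txt file2.txt > Final_rsIDs.txt. Coordinates in file2.txt that have no match in file1.txt are simply dropped from the output.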