Hi,
This post is a continuation of my previous post: Compare two columns of same file as paired rows
So I am skipping the introduction to the file. Kevin replied with two commands. The second command is:
awk '!index($3, $4) { print }' test.tsv | grep -v -f matches.list
Here grep -v is supposed to invert the match, which it does, but it still seems to be dropping records from the output that are not present in matches.list.
For example, my output should have these records:
CHR POS AACHANGE EFFECT
chr15 41342365 T35A D
chr15 41342366 T35N D
because matches.list has no T35, T35A or T35N. But these records are still not present, and I am unable to understand why. I checked my file for proper formatting, invisible characters, etc., and all seems well. Is this a grep error, or something else? I am using Ubuntu 16.04 LTS.
Any help appreciated,
Thanks in advance,
Waqas.
try to
to see what happened.
I tried your command. Instead of getting a colored T35, I got the following line:
in which A97 is colored. I changed my test.tsv to test_annotated now.
Second, 41346366 is also present if I run:
but it disappeared when I run your whole command (as you noted above).
You have something like 'A97' in your matches.list. That's why the line is removed.
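One way to see which pattern is responsible is to ask grep to print only the matched text with -o. The sketch below uses made-up data (the file demo.tsv, the fifth column and the pattern G12 are hypothetical, not taken from the real files) and assumes a grep that supports -o, such as GNU grep:

```shell
#!/bin/sh
# Hypothetical demo files; names and contents are illustrative only.
tmp=$(mktemp -d)
printf 'A97\nG12\n' > "$tmp/matches.list"
printf 'chr15\t41342365\tT35A\tD\tXA97Y\n' > "$tmp/demo.tsv"

# -o prints only the part of the line that matched one of the patterns,
# which shows that "A97" (inside the last column) triggered the match.
hit=$(grep -o -f "$tmp/matches.list" "$tmp/demo.tsv")
echo "matched text: $hit"

rm -rf "$tmp"
```

If the printed text is not the pattern you expected, that pattern is matching as a substring somewhere else on the line.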
Yes, Pierre, there is an A97 in matches.list; that's why the desired record was removed.
I changed Kevin's command:
awk '!index($3, $4) { print }' test.tsv | grep -v -f matches.list
to:
awk '!index($3, $4) { print }' test.tsv | grep -w -v -f matches.list
and now the output has all the records that were removed due to that.
Thanksssss!!!!!
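For anyone reading later, here is a small sketch of the difference between the two commands. The data is made up for illustration (demo.tsv, its fifth column and the position 41342999 are hypothetical), so treat it as a demonstration of how -w behaves rather than a reproduction of the real files:

```shell
#!/bin/sh
# Hypothetical demo files; names and contents are illustrative only.
tmp=$(mktemp -d)
printf 'A97\n' > "$tmp/matches.list"
printf 'chr15\t41342365\tT35A\tD\tXA97Y\n' >  "$tmp/demo.tsv"
printf 'chr15\t41342999\tA97\tD\tnote\n'   >> "$tmp/demo.tsv"

# Without -w, "A97" matches as a substring anywhere on the line,
# so both lines are filtered out and the count of kept lines is 0.
plain=$(grep -c -v -f "$tmp/matches.list" "$tmp/demo.tsv" || true)

# With -w, "A97" must stand alone as a whole word, so only the second
# line is filtered out and the T35A record survives.
word=$(grep -w -v -f "$tmp/matches.list" "$tmp/demo.tsv" | cut -f3)

echo "lines kept without -w: $plain"
echo "AACHANGE kept with -w: $word"

rm -rf "$tmp"
```

One caveat: with -w a pattern such as A97 no longer matches longer tokens like A97T, so if matches.list is meant to act as a list of prefixes, -w may keep more records than intended.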
There is no reason to see a colored T35 with my command line.