10 weeks ago by
Seattle, WA USA
To make grep do exact-word matches from a file of strings, use
$ grep -w -F -f 1.txt 2.gff > 3.gff
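The -w option restricts matches to whole words. The -F option treats each pattern as a fixed string instead of a regular expression, so you get exact-string matching. The -f option reads the patterns from 1.txt, one per line.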
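To illustrate what -F changes, here is a minimal sketch. The file names ids.txt and sample.gff and their contents are made up for this example and are not from the original question.

```shell
# Hypothetical sample data, used only for illustration.
tmp=$(mktemp -d)
printf 'gene.1\n' > "$tmp/ids.txt"
printf 'chr1\tgene.1\nchr1\tgeneX1\n' > "$tmp/sample.gff"

# Without -F, the "." in the pattern is a regex wildcard,
# so the geneX1 line matches as well.
regex_hits=$(grep -c -f "$tmp/ids.txt" "$tmp/sample.gff")

# With -F, the "." is a literal dot, so only the gene.1 line matches.
fixed_hits=$(grep -c -F -f "$tmp/ids.txt" "$tmp/sample.gff")

echo "regex: $regex_hits, fixed: $fixed_hits"
rm -r "$tmp"
```

This matters whenever the strings in your list contain characters such as "." or "*" that mean something special in a regular expression.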
Using -f on its own will consume a lot of memory, and it will also match strings in 1.txt that are only substrings of longer words in 2.gff.
In other words, if you have a string like 12345 in 1.txt, then using -f on its own will match a larger set of strings that contain 12345: 12345 itself (what you want), but also 1234567, 012345, etc. found in 2.gff.
The match to 12345 is desired, but the matches inside longer strings like 1234567 are probably not what you want.
So by combining -w and -F, you get an exact match for the string you provide. 12345 will only produce a hit on the word 12345, and not on any other string where 12345 is a prefix, suffix, or substring of something else.
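The difference can be checked on a tiny example. This is only a sketch: the files reuse the names 1.txt and 2.gff inside a temporary directory, and their contents are invented for the demonstration.

```shell
# Hypothetical sample data, used only for illustration.
tmp=$(mktemp -d)
printf '12345\n' > "$tmp/1.txt"
printf 'a\tID=12345\nb\tID=1234567\nc\tID=012345\n' > "$tmp/2.gff"

# Patterns from a file, no -w: every line containing 12345 is counted,
# including those where it is part of a longer number.
substring_count=$(grep -c -f "$tmp/1.txt" "$tmp/2.gff")

# Whole-word, fixed-string matching: only the line where 12345
# stands alone as a word is returned.
word_hits=$(grep -w -F -f "$tmp/1.txt" "$tmp/2.gff")
word_count=$(printf '%s\n' "$word_hits" | wc -l | tr -d ' ')

echo "without -w: $substring_count lines, with -w -F: $word_count line"
printf '%s\n' "$word_hits"
rm -r "$tmp"
```

Note that grep counts letters, digits, and the underscore as word characters, so with -w the string 12345 would still match inside something like 12345.1, because the "." ends the word. If your identifiers carry version suffixes like that, check a few hits by eye.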
As a bonus, using -F makes grep consume a great deal less memory. Matching a long list of patterns as regular expressions can use lots of memory, but fixed-string matching does not need to.