Hello, I have a file containing gene names of interest (24423 genes) and another file containing the lengths of all genes (41306 genes). I want the lengths only for the genes in my list, but when I run
grep -wf file1 file2 or even
fgrep -Fwf file1 file2, I get some excess genes, because my list may contain only the sense or only the antisense strand of a gene, whereas the reference file may contain both, and both end up in the output.
Is there a way to remove from the reference file (file2) all the lines that don't match a gene in my list?
P.S. The question is also on stackoverflow.com
However, I always get MYB-AS2 and MYB-AS3 in the output.
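A likely cause is that grep -w treats the hyphen as a non-word character, so a pattern such as MYB also matches as a whole word inside MYB-AS2 and MYB-AS3. One possible workaround is to compare the whole gene-name field instead of searching for the name anywhere in the line. The sketch below assumes that the gene name is the first whitespace-separated column of file2 and that file1 holds one gene name per line; the sample genes and lengths are made up for illustration only.

```shell
# Hypothetical sample data; the column layout and the length values are assumptions.
printf 'MYB\nTP53\n' > file1
printf 'MYB\t3000\nMYB-AS2\t1200\nMYB-AS3\t900\nTP53\t2500\n' > file2

# First pass (NR==FNR, i.e. while reading file1): remember every wanted gene name.
# Second pass (file2): print a line only if its first field is exactly one of those names.
awk 'NR==FNR { wanted[$1]; next } $1 in wanted' file1 file2 > matched.txt

cat matched.txt
```

Because the comparison is on the full first field, MYB in the list no longer pulls in MYB-AS2 or MYB-AS3. If the gene name sits in a different column of file2, $1 in the second condition would need to be changed accordingly.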