Hello, I have a file containing gene names of interest (24423 genes), and another file containing the lengths of all the genes (41306 genes). I want the lengths only for the genes of interest, but when I run
grep -wf file1 file2 or even
fgrep -Fwf file1 file2, I get some extra genes, because for some genes my list contains only the sense or only the anti-sense entry, whereas the reference file may contain both, and both end up in the output.
Is there a way to remove from the reference file (file2) all the lines that don't match a gene in my list?
P.S. The question is also on stackoverflow.com
But I always get MYB-AS2 and MYB-AS3
And you'll soon get some negative votes on Stack Overflow because you don't show any sample of your files.
Hi, can you post an example of your file1, file2 and desired output?
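Without seeing the real files this is only a guess, but the MYB-AS2 and MYB-AS3 hits are what grep -w would be expected to produce: "-" is not a word character, so the pattern MYB counts as a whole word inside MYB-AS2. A minimal sketch of an exact match on the gene-name column follows. It assumes file2 is whitespace- or tab-delimited with the gene name in column 1 and that file1 holds one name per line; the toy file contents and lengths below are made up for illustration.

```shell
# Toy inputs (hypothetical; gene lengths are invented for the example).
tmp=$(mktemp -d)
printf 'MYB\nTP53\n' > "$tmp/file1"
printf 'MYB\t3000\nMYB-AS2\t500\nMYB-AS3\t700\nTP53\t2500\nBRCA1\t8000\n' > "$tmp/file2"

# grep -w also counts the MYB-AS2 and MYB-AS3 lines, because "-" ends a word.
grep -c -wFf "$tmp/file1" "$tmp/file2"

# Exact match on column 1: first pass stores the names from file1 as array keys,
# second pass prints only the file2 lines whose first field is one of those keys.
result=$(awk 'NR==FNR { keep[$1]; next } $1 in keep' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$result"
```

On the real data the same awk line would be run as awk 'NR==FNR { keep[$1]; next } $1 in keep' file1 file2. If the gene name sits in a different column of file2, or the delimiter is not whitespace, the second $1 and the field separator would need adjusting.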