4.7 years ago
Kumar
I have a list of words (sequence IDs) in one file (file 1) and a large genomic GFF annotation in another file (file 2). I want to grep every line of file 2 that matches a word from file 1 into an output file (file 3), and I also want to generate a file of the words from file 1 that have no match in file 2. Please see the example below:
File 1:
NODE_55_length_30858_cov_27.421
NODE_54_length_29424_cov_167.508
NODE_22_length_84792_cov_25.8257
NODE_38_length_25225_cov_29.0986
File 2: (a large GFF file, only the first lines shown)
##gff-version 3
##sequence-region NODE_1_length_232048_cov_20.4417 1 232048
...................................
OUTPUT 1:
##sequence-region NODE_54_length_29424_cov_167.508 1 29424
NODE_54_length_29424_cov_167.508 Prodigal:2.6 CDS 454 975 . - 0 ID=LGE3207_04049;Name=insJ;gene=insJ;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:RefSeq:G7776-MONOMER;locus_tag=LGE3207_04049;product=IS150 protein InsA
NODE_54_length_29424_cov_167.508 Prodigal:2.6 CDS 1026 1745 . - 0 ID=LGE3207_04050;inference=ab initio prediction:Prodigal:2.6;locus_tag=LGE3207_04050;product=hypothetical protein
OUTPUT 2: (words not found in file 2)
NODE_55_length_30858_cov_27.421
NODE_22_length_84792_cov_25.8257
NODE_38_length_25225_cov_29.0986
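One way the two outputs could be produced is sketched below in Python. This is only a sketch under some assumptions: the function names `split_gff_by_ids` and `write_outputs` are made up for illustration, a line counts as a match only when the ID is exactly the first column of a feature line or the ID on a `##sequence-region` line, and file 1 is small enough to hold in memory. Exact matching is used rather than substring matching so that one ID cannot accidentally match a longer ID that starts with the same text.

```python
def split_gff_by_ids(ids, gff_lines):
    """Return (matched_lines, missing_ids) for the given IDs and GFF lines.

    ids       -- iterable of ID strings (file 1, one ID per line)
    gff_lines -- iterable of GFF lines (file 2)
    """
    # Keep the original order of file 1 for the "not found" report.
    wanted = [i.strip() for i in ids if i.strip()]
    wanted_set = set(wanted)
    seen = set()
    matched = []

    for line in gff_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "##sequence-region" and len(fields) > 1:
            seqid = fields[1]      # ID follows the pragma keyword
        elif fields[0].startswith("#"):
            continue               # other comment or pragma lines
        else:
            seqid = fields[0]      # column 1 of a feature line
        if seqid in wanted_set:
            matched.append(line.rstrip("\n"))
            seen.add(seqid)

    missing = [i for i in wanted if i not in seen]
    return matched, missing


def write_outputs(ids_path, gff_path, matched_path, missing_path):
    """Hypothetical helper: read file 1 and file 2, write file 3 and the not-found list."""
    with open(ids_path) as ids_fh, open(gff_path) as gff_fh:
        matched, missing = split_gff_by_ids(ids_fh, gff_fh)
    with open(matched_path, "w") as out:
        out.writelines(m + "\n" for m in matched)
    with open(missing_path, "w") as out:
        out.writelines(m + "\n" for m in missing)
```

Calling `write_outputs("file1.txt", "file2.gff", "file3.gff", "not_found.txt")` (file names here are placeholders) would stream through the GFF once, so the size of file 2 should not be a problem.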