I have 5000 files (named like tni00001.keg, eco00001.keg, etc.). In many of them, the second column contains an underscore (e.g. W909_00110); the number after the underscore (here 00110) represents the enzyme ID. However, some of the 5000 files do not have this underscore in their IDs. I want to extract the names of those files (like abc00001.keg) that do not contain an underscore in the second column.
Example: a .keg file looks like this inside:
D W909_00110 glk; glucokinase K00845 glk; glucokinase [EC:2.7.1.2]
D W909_17905 pgi; glucose-6-phosphate isomerase K01810 GPI; glucose-6-phosphate isomerase [EC:5.3.1.9]
D W909_19315 6-phosphofructokinase K00850 pfkA; 6-phosphofructokinase 1 [EC:2.7.1.11]
Is the absence of _ consistent for all lines in these files, or is it only some records that may not have it?
There are 5000 files in total, and some of them do not have this underscore in the IDs. What I want to know is how many files do not contain it, and I want to extract the names of all of those files.
The clarification I was asking for is: do all records in a file of interest lack the _, or only some?
Are these fields all tab separated?
May I ask what the ultimate goal is? I.e., why are you specifically interested in files where the underscore is missing?
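For what it's worth, here is a minimal Python sketch of one way to list such files. It is not a definitive answer, because the questions above are still open, so it rests on a few assumptions: the records of interest are the lines whose first field is D, the ID is the second whitespace-separated field (so it works for tabs or spaces), and all the .keg files sit in one directory. The function names and the every_record switch are made up for this illustration; the switch covers both readings of the question (no record has an underscore, or at least one record lacks it).

```python
from pathlib import Path


def second_column_ids(path):
    """Yield the second whitespace-separated field of each 'D' line."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            fields = line.split()
            # Assumption: gene records start with 'D' and carry the ID in field 2.
            if len(fields) >= 2 and fields[0] == "D":
                yield fields[1]


def files_without_underscore(directory, pattern="*.keg", every_record=True):
    """Return sorted names of files whose second-column IDs lack '_'.

    every_record=True  -> report a file only if none of its IDs has '_'.
    every_record=False -> report a file if at least one ID lacks '_'.
    """
    hits = []
    for path in sorted(Path(directory).glob(pattern)):
        missing = ["_" not in gene_id for gene_id in second_column_ids(path)]
        if not missing:
            continue  # no 'D' records found; skip rather than guess
        matched = all(missing) if every_record else any(missing)
        if matched:
            hits.append(path.name)
    return hits
```

Calling files_without_underscore("/path/to/keg_files") returns the list of matching file names, and len() of that list gives the count. Files with no D lines at all are skipped rather than reported, which may or may not be what you want.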