find unpaired files
4.7 years ago
arraychip ▴ 30

In a folder, there are .txt files, all with unique names. An analysis generates one new file per input file, named "file name_analyzed.txt". For some unknown reason, some input files don't produce an "_analyzed.txt" file.

So, it looks like:

file1.txt file1_analyzed.txt
file2.txt file2_analyzed.txt
**file3.txt**
file4.txt file4_analyzed.txt
..
fileN.txt fileN_analyzed.txt

How can I list all the files like "file3" that have no accompanying pair? A folder typically holds over 40,000 files. Is there a command line that solves this problem? Thanks.

sequence • 737 views
4.7 years ago
GenoMax 141k

One way.

$ ls -1
file1.txt
file1_analyzed.txt
file2.txt
file3.txt
file3_analyzed.txt
file4_analyzed.txt

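# extglob is needed in bash for the !(pattern) glob used below
$ shopt -s extglob

# escape the dots and anchor with $ so sed strips only the real suffix
$ comm -3 <(ls -1 *_analyzed.txt | sed 's/_analyzed\.txt$//' | sort) <(ls -1 !(*_analyzed.txt) | sed 's/\.txt$//' | sort)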
    file2
file4

In output column 1: file4_analyzed.txt has no corresponding plain file
In output column 2: file2 has no corresponding _analyzed.txt file

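If the only problem is missing _analyzed.txt files, then all of the output lands in column 2 (the indented one). Using comm -13 instead of comm -3 prints just that column, without the indentation.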
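
If you only care about inputs that lack an "_analyzed.txt" partner, a plain loop is another option. This is a minimal sketch, not part of the answer above: the function name list_unpaired is made up for illustration, and it assumes every input ends in ".txt" and every output is named exactly "<input name>_analyzed.txt" in the same folder. Because the glob is expanded inside the shell and never passed to an external command, the number of files should not run into argument-length limits.

```shell
# Hypothetical helper: print each input .txt file in a directory that has
# no "<name>_analyzed.txt" file next to it. Orphaned _analyzed.txt files
# (outputs without an input) are deliberately not reported.
# The body is a subshell ( ... ), so dir and f stay local to the function.
list_unpaired() (
    dir=${1:-.}                       # default to the current directory
    for f in "$dir"/*.txt; do
        [ -e "$f" ] || continue       # the glob matched nothing
        case $f in
            *_analyzed.txt) continue ;;   # skip the analysis outputs themselves
        esac
        # strip ".txt", append "_analyzed.txt", and test for that file
        [ -e "${f%.txt}_analyzed.txt" ] || printf '%s\n' "${f##*/}"
    done
)
```

Calling it as list_unpaired /path/to/folder > unpaired.txt would collect the names in a file. On the six example files listed above it should print only file2.txt.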

It worked perfectly. Greatly appreciate it.
