Question: Create files in multiple directories via for loop
8 weeks ago by
vinayjrao200
inStem, India
vinayjrao200 wrote:

Hi, I have around 500 directories, and I wish to first create an empty text file within each of these and then process my data files contained within sub-directories and redirect the output into the respective text files.

An example of my existing folders - exisiting/main/sub1/sub2/sub3/sub4/sub5/sub6/data.tsv.bz2

What I wish to do - create main/sub1.txt & extract data.tsv.bz2 > main/sub1.txt

I have figured out how to redirect my output into the desired files, but I am having trouble creating the .txt files.

The command I tried for this is for i in */; do touch $i.txt; done. This creates a file named .txt within the main folder. So I also tried for i in */ ; do cd "$i"; for j in */; do ( touch $j.vcf ) done; done, which gave me an error saying

bash: cd: main/: No such file or directory

Since I have typed the command from my existing directory, it first identifies the main directory. I want to be able to provide the name of the sub1 directory while creating sub1.txt

Is there any way to get around this problem?

Thanks in advance

linux shell • 191 views
modified 8 weeks ago • written 8 weeks ago by vinayjrao (200)

You didn't mention where the subdirectories are. Here is a solution to the OP's problem:

$ find . -type f -name "*.bz2"                                  
./exisiting/main/sub1/sub2/sub3/sub4/sub5/sub6/data.tsv.bz2

$ parallel --plus --dry-run 'bzip2 -dc {} > main/{/.}' ::: $(find . -type f -name "*.bz2") 

bzip2 -dc ./exisiting/main/sub1/sub2/sub3/sub4/sub5/sub6/data.tsv.bz2 > main/data.tsv
written 8 weeks ago by cpad0112 (14k)

As mentioned by @cpad0112, the problem statement is not very clear.

# Create random subdirectories for testing
$ mkdir sub{1..6}
# List the contents of the current directory
$ ls -l
data.tsv.bz2
sub1/
sub2/
sub3/
sub4/
sub5/
sub6/

Probable solution

# Extract the data file into each subdirectory, using the directory name as the file name
$ for dir in */; do bzcat data.tsv.bz2 > "${dir}${dir%/}.txt"; done
$ tree
├── data.tsv.bz2
├── sub1
│   └── sub1.txt
├── sub2
│   └── sub2.txt
├── sub3
│   └── sub3.txt
├── sub4
│   └── sub4.txt
├── sub5
│   └── sub5.txt
└── sub6
    └── sub6.txt
modified 8 weeks ago • written 8 weeks ago by Arup Ghosh (2.7k)

Thank you both for your responses. I am sorry that my problem was not very clear. I am working from the directory named existing/, which contains subdirectories such as main/sub1/sub2/sub3/sub4/sub5/sub6/. sub6/ contains the file data.tsv.bz2. I wish to create a file called sub1.txt in the main/ directory. For a single directory I would simply run touch main/sub1.txt, but since I have 500 directories (for example main1/, main2/, main13/, main27/, etc.), I wish to do this using a for loop such as for i in */ ; do cd "$i"; for j in */; do ( touch $j.vcf ) done; done. Here, I tried to identify main/ with $i and sub1/ with $j.

written 8 weeks ago by vinayjrao (200)

Is sub1/sub2/sub3/sub4/sub5/sub6/ a fixed string? As I understand it, main is not (main1 .. main500). It would help if you could post the directory tree.

written 8 weeks ago by cpad0112 (14k)

Starting from the directory existing/

└── main/
    └── sub1/
        └── sub2/
            └── sub3/
                └── sub4/
                    └── sub5/
                        └── sub6/
                            └── data.tsv.bz2

modified 8 weeks ago • written 8 weeks ago by vinayjrao (200)

Since the OP's information is incomplete, the best you can do is:

 1. cd existing
 2. find . -type f -name "*.bz2" | while read -r line; do echo bzip2 -dc "$line" ">" "${line%%/sub2/*}.txt"; done

The output should look like this. The command above is only a dry run: it prints the commands without executing them:

bzip2 -dc ./main1/sub1/sub2/sub3/sub4/sub5/sub6/data.tsv.bz2 > ./main1/sub1.txt
bzip2 -dc ./main2/sub1/sub2/sub3/sub4/sub5/sub6/data.tsv.bz2 > ./main2/sub1.txt

Dry run with parallel:

$ cd existing
$ parallel --dry-run 'bzip2 -dc {} > {=s/\/sub2.*//=}.txt' ::: $(find . -type f -name "*.bz2")

bzip2 -dc ./main1/sub1/sub2/sub3/sub4/sub5/sub6/data.tsv.bz2 > ./main1/sub1.txt
bzip2 -dc ./main2/sub1/sub2/sub3/sub4/sub5/sub6/data.tsv.bz2 > ./main2/sub1.txt
modified 8 weeks ago • written 8 weeks ago by cpad0112 (14k)
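
Once the dry run prints the expected commands, the same idea can be run for real. The following is only a sketch under the assumptions of this thread: every archive sits below a .../sub2/ directory, and the output should be named after the path component just above sub2. The demo tree, the variable name archive, and the sample file contents are made up for illustration. It uses find -exec instead of a while read loop so that paths containing spaces are passed through intact.

```shell
# Hypothetical demo tree in a throwaway directory (names and contents are illustrative).
demo=$(mktemp -d)
deep=sub1/sub2/sub3/sub4/sub5/sub6
mkdir -p "$demo/main1/$deep" "$demo/main2/$deep"
printf 'a\tb\n' | bzip2 -c > "$demo/main1/$deep/data.tsv.bz2"
printf 'c\td\n' | bzip2 -c > "$demo/main2/$deep/data.tsv.bz2"
cd "$demo" || exit 1

# For every .bz2 file, cut the path at the first "/sub2/" and append ".txt":
# ./main1/sub1/sub2/.../data.tsv.bz2 -> ./main1/sub1.txt
find . -type f -name '*.bz2' -exec sh -c '
    for archive in "$@"; do
        target="${archive%%/sub2/*}.txt"
        bzip2 -dc "$archive" > "$target"
    done
' sh {} +
```

The redirection creates each .txt file, so no separate touch step is needed. If a path does not contain /sub2/, the expansion leaves it unchanged and the output would land next to the archive with .txt appended, so it is worth checking the dry-run output first.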

This works fine. Thanks a lot. You may move it to an answer so that I can accept it. Thank you once again :)

written 7 weeks ago by vinayjrao (200)

So, will this extract data.tsv.bz2 into sub1.txt?

I would also like to know if I can create files with touch in the way I mentioned in the post?

Thanks once again.

written 8 weeks ago by vinayjrao (200)

Take a backup of one file and run one command from the dry run on the backed-up directory/file. Then you will know whether it works or not.

modified 8 weeks ago • written 8 weeks ago by cpad0112 (14k)
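
On the touch question: in for i in */, each value of $i ends with a slash (for example main/), so touch $i.txt expands to touch main/.txt and creates a hidden file named .txt inside main/. The nested loop most likely fails because cd "$i" is never undone, so the next cd is resolved from inside the previous directory. A minimal sketch that avoids cd altogether, assuming the main*/sub1/ layout described in the thread (the demo tree and the variable name d are illustrative only):

```shell
# Hypothetical demo tree in a throwaway directory (names are illustrative).
demo=$(mktemp -d)
mkdir -p "$demo/main1/sub1/sub2" "$demo/main2/sub1/sub2"
cd "$demo" || exit 1

# Each match of */*/ looks like "main1/sub1/" (note the trailing slash).
for d in */*/; do
    # ${d%/} strips the trailing slash, giving main1/sub1, so the file
    # is created as main1/sub1.txt rather than main1/sub1/.txt.
    touch "${d%/}.txt"
done

# List the files that were created.
ls */*.txt
```

Because the glob already carries both directory levels, no cd or nested loop is needed. If changing directory is really required, wrapping cd and the commands that follow it in a subshell, ( cd "$i" && ... ), returns to the starting directory after each iteration.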