How to execute one command on all files within a directory
2.6 years ago
Kumar ▴ 170

Hi, I have several paired-end FASTQ files (R1 and R2) in a directory, for example 9-Lmm05-P1413070_S9_L001_R1_001.fastq.gz and 9-Lm05-P1413070_S9_L001_R2_001.fastq.gz. I want to run the following command on all the FASTQ files in the directory and write the output for each pair (R1 and R2) to a separate directory named after the pair's prefix (e.g. 9-Lmm05-P1413070).

spades.py -k 21,33,55,77 -1 file1_R1_001.fastq.gz -2 file2_R2_001.fastq.gz -o Assembly

Could you please suggest a Perl/Python script to do this?

Thank you,

Python FASTQ Perl • 1.4k views

What have you tried?

I tried the following, but it was not correct. I can see where it went wrong from asalimih's suggestion.

for f in /home/R*
do
    spades.py -k 21,33,55,77 -1 $f -2 $f -o ${f/_R1*/\/Assembly}
done

Thank you for your help.

2.6 years ago
asalimih ▴ 60

Here is a bash solution. I assume that Assembly in your example is the output directory, so I replaced it to match your explanation.

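for f in *_R1_*
do
    # "$f" is the R1 file; swap the read tag to get its R2 mate
    spades.py -k 21,33,55,77 -1 "$f" -2 "${f/_R1_/_R2_}" -o "${f%%_R1*}/Assembly"
done

${f/_R1_/_R2_} replaces the first _R1_ in the file name with _R2_, which gives the name of the mate file. ${f%%_R1*} strips everything from the first _R1 onwards, so each pair is written to its own <prefix>/Assembly directory. Note that the $ in front of spades.py in your example is the shell prompt and must not be part of the command, and that the variables are quoted so that file names with spaces do not break the loop.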
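Since the question asked for a Python script, here is a minimal sketch of the same pairing logic. It is only an illustration under a few assumptions: the files follow the `<prefix>_R1_<rest>.fastq.gz` naming from the question, the read tag is the last `_R1_` in the name, and spades.py is on your PATH. The helper names (`split_r1`, `spades_command`, `plan_runs`) and the `DRY_RUN` switch are made up for this example.

```python
import subprocess
from pathlib import Path

DRY_RUN = True  # hypothetical switch: set to False to actually launch SPAdes


def split_r1(r1_name):
    """Split an R1 file name at its last '_R1_' tag.

    Returns (prefix, name of the matching R2 file).
    """
    prefix, tag, rest = r1_name.rpartition("_R1_")
    if not tag:
        raise ValueError(f"no '_R1_' tag in {r1_name!r}")
    return prefix, f"{prefix}_R2_{rest}"


def spades_command(r1, r2, outdir):
    """Build the argument list for one run, with the options from the question."""
    return ["spades.py", "-k", "21,33,55,77",
            "-1", str(r1), "-2", str(r2), "-o", str(outdir)]


def plan_runs(directory="."):
    """Yield one command per R1 file in `directory` that has an R2 mate."""
    for r1 in sorted(Path(directory).glob("*_R1_*.fastq.gz")):
        prefix, r2_name = split_r1(r1.name)
        r2 = r1.with_name(r2_name)
        if not r2.is_file():
            print(f"skipping {r1.name}: mate {r2_name} not found")
            continue
        # Output goes to <prefix>/Assembly next to the input files.
        yield spades_command(r1, r2, r1.parent / prefix / "Assembly")


if __name__ == "__main__":
    for cmd in plan_runs("."):
        print(" ".join(cmd))
        if not DRY_RUN:
            subprocess.run(cmd, check=True)  # stop at the first failed assembly
```

With `DRY_RUN = True` the script only prints the commands it would run, which is a cheap way to check the pairing before starting long assemblies.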

Thank you very much for your help.

Hi, I have a follow-up question. I am running this script and getting all the output files in the "Assembly" directory. However, I would like to prefix every file name in the Assembly folder with the sample name taken from the input FASTQ files. For example, for the sample files Sal521_R1.fastq.gz and Sal521_R2.fastq.gz, I want to add Sal521 to all the output files in the "Assembly" folder, giving Sal521_scaffolds.fasta, Sal521_contigs.fasta, etc. Thank you!

Assuming you have a directory structure of this form:

./A
./A/Assembly/a.txt
./A/Assembly/b.txt
./B
./B/Assembly/g.txt
./B/hh.txt
./bbb.txt

The following code may do what you want:

find . -mindepth 2 -maxdepth 3 -type f -path '*Assembly*/*' | while IFS= read -r line
do
   # quote both paths so names containing spaces are moved correctly
   mv "$line" "$(sed -E 's/\.\/(.+)\/Assembly\/(.*)/\.\/\1\/Assembly\/\1_\2/g' <<< "$line")"
done

Output:

./A
./A/Assembly/A_a.txt
./A/Assembly/A_b.txt
./B
./B/Assembly/B_g.txt
./B/hh.txt
./bbb.txt

This code only affects files that are directly inside an Assembly folder: it prepends the name of the Assembly folder's parent directory to the name of each of those files.
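If you would rather do the renaming in Python, the same idea can be sketched as follows. This is an illustration, not a tested pipeline step: it assumes the `<sample>/Assembly/<file>` layout shown above, and the function names and the `dry_run` flag are invented for the example. Unlike the find loop, it skips files that already start with `<sample>_`, so running it twice should not add the prefix twice (at the cost of also skipping any file that happens to start with the sample name).

```python
from pathlib import Path


def prefixed_path(path):
    """Map <sample>/Assembly/<file> to <sample>/Assembly/<sample>_<file>."""
    path = Path(path)
    sample = path.parent.parent.name  # directory that contains Assembly
    return path.with_name(f"{sample}_{path.name}")


def rename_assembly_outputs(root=".", dry_run=True):
    """Prefix files directly inside <root>/<sample>/Assembly with '<sample>_'.

    Returns the list of (old, new) paths; nothing is renamed when dry_run is True.
    """
    renamed = []
    for path in sorted(Path(root).glob("*/Assembly/*")):
        sample = path.parent.parent.name
        # Skip sub-directories and files that already carry the prefix.
        if not path.is_file() or path.name.startswith(sample + "_"):
            continue
        target = prefixed_path(path)
        if not dry_run:
            path.rename(target)
        renamed.append((path, target))
    return renamed


if __name__ == "__main__":
    # Dry run: only list what would be renamed.
    for old, new in rename_assembly_outputs(".", dry_run=True):
        print(f"{old} -> {new}")
```

Run it once with `dry_run=True` to review the planned renames, then call `rename_assembly_outputs(".", dry_run=False)` to apply them.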
