Having trouble using wildcard
3.4 years ago
nattzy94 ▴ 50

I am trying to use the bbsplit function for a number of files. I have done:

for i in {17..34}; do
bash bbmap/bbsplit.sh \
in1=./temp_expt/Sample_MBM1${i}/MBM1${i}_R1_001.fastq \
in2=./temp_expt/Sample_MBM1${i}/MBM1${i}_R2_001.fastq \
ref=./MG1655.fasta,./MGH78578.fasta,./GAPDH.fasta \
basename=out_%.fq outu1=clean1.fq outu2=clean2.fq ambig2=toss
done


However, I keep running into a "can't find file" error like this:

Can't find file ./temp_expt/Sample_MBM118/MBM18*_R1_001.fastq


What happens when you type

  ls ./temp_expt/Sample_MBM118/MBM18*_R1_001.fastq


? I reckon those files don't exist


I get './temp_expt/Sample_MBM118/MBM118-TTACGTGC-CAAGGTCT_S66_L008_R1_001.fastq'

NB: I realize the previous command is missing a '1' and should be MBM1${i}, but the problem still persists.

Well, this is confusing. Your code is supposed to throw a different error. Copy/pasted from the OP:

  for i in {17..34}; do bash bbmap/bbsplit.sh in1=./temp_expt/Sample_MBM1${i}/MBM${i}_R1_001.fastq in2=./temp_expt/Sample_MBM1${i}/MBM1${i}_R2_001.fastq ref=./MG1655.fasta,./MGH78578.fasta,./GAPDH.fasta basename=out_%.fq outu1=clean1.fq outu2=clean2.fq ambig2=toss; done

R1 variable (in1) = MBM${i}_R1_001.fastq
R2 variable (in2) = MBM1${i}_R2_001.fastq

If your variable is 17, R1 is MBM17_R1_001.fastq and R2 is MBM117_R2_001.fastq. Is this a typo?

From one of the OP's replies, the file ./temp_expt/Sample_MBM118/MBM118-TTACGTGC-CAAGGTCT_S66_L008_R1_001.fastq exists. This means that for this file the R1 variable is MBM1${i}_R1_001.fastq, not MBM${i}_R1_001.fastq. Either the variable or the file name needs to be changed.

Yes, the original command had a typo but I've corrected it and the file still cannot be found. I've edited the initial post to reflect this.

I think the OP's code still has a problem (assuming that it is updated):

  in1=./temp_expt/Sample_MBM1${i}/MBM1${i}_R1_001.fastq

This would look only for MBM118_R1_001.fastq under the Sample_MBM118 folder, not for MBM18*_R1_001.fastq. The error should be something like

  Can't find file ./temp_expt/Sample_MBM118/MBM118*_R1_001.fastq

not

  Can't find file ./temp_expt/Sample_MBM118/MBM18*_R1_001.fastq

as MBM1 is fixed.

You can check that the files exist with the following code. Please change the path as needed:

  for i in {17..34}; do if [ -e ./MBM1${i}_R1_001.fastq ]; then echo "exists"; fi; done
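Since the real file names carry a tag suffix (as in the `ls` output earlier in the thread), a variant of that check could count the files matching a wildcard instead of testing one fixed name. This is only a sketch: `count_r1` is a made-up helper name, and the layout `<base>/Sample_MBM1<i>/MBM1<i>-<tags>_R1_001.fastq` is assumed from the paths quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many R1 files exist for one sample number.
# Assumed layout: <base>/Sample_MBM1<i>/MBM1<i>-<tags>_R1_001.fastq
count_r1() (
    base=$1
    id=$2
    # Let the shell expand the glob into the positional parameters.
    set -- "$base/Sample_MBM1${id}/MBM1${id}"-*_R1_001.fastq
    # An unmatched glob stays literal, so make sure the first hit exists.
    [ -e "$1" ] || set --
    echo "$#"
)

for i in {17..34}; do
    echo "MBM1${i}: $(count_r1 ./temp_expt "$i") R1 file(s)"
done
```

A count of 0 points to a naming mismatch, and a count above 1 means a single wildcard would pick up several files for that sample.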


FYI, I've restructured the command from a one-liner just to make it a little easier for people to debug since it was quite long.

  Can't find file ./temp_expt/Sample_MBM118/MBM18*_R1_001.fastq

This error would suggest that it's not interpreting the wildcard in the shell and is looking for a literal *.

Do you have any more information or other error messages to go on? At the moment there doesn't seem to be a correspondence between the code and the error.
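One plausible explanation for a literal * reaching the program: bash matches a wildcard against the whole word, so in `in1=./dir/*.fastq` the pattern is compared with paths that begin with `in1=.`, nothing matches, and the word is passed on unchanged. The small demo below uses throwaway files; the directory and file names are invented for illustration, and it assumes the default shell options (no `nullglob` or `failglob`).

```shell
#!/usr/bin/env bash
# Throwaway directory with one file that follows the naming in the thread.
demo=$(mktemp -d)
touch "$demo/MBM118-TAG_R1_001.fastq"

# Plain glob: the shell expands it to the existing file.
plain=$(printf '%s\n' "$demo"/MBM118-*_R1_001.fastq)

# Glob glued to "in1=": the whole word is the pattern, no path matches,
# so the literal '*' is what the called program receives.
glued=$(printf '%s\n' in1="$demo"/MBM118-*_R1_001.fastq)

echo "plain: $plain"
echo "glued: $glued"

rm -rf "$demo"
```

The "glued" line still contains the `*`, which is consistent with the error message quoted above.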

3.4 years ago

Your search pattern doesn't match the file names.

in1=./temp_expt/Sample_MBM1${i}/MBM1${i}_R1_001.fastq \
in2=./temp_expt/Sample_MBM1${i}/MBM1${i}_R2_001.fastq \


while your file names look like

MBM118-TTACGTGC-CAAGGTCT_S66_L008_R1_001.fastq


If you are sure there is only one pair of files per sample id, you could simply use a wildcard:

in1=./temp_expt/Sample_MBM1${i}/MBM1${i}-*_R1_001.fastq \
in2=./temp_expt/Sample_MBM1${i}/MBM1${i}-*_R2_001.fastq \


But if there is a chance that there are multiple files per sample, e.g. with different tag sequences, then you should do it a bit differently.
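One way to cover both cases is to let the shell resolve the wildcard first and only then build the `in1=`/`in2=` arguments. This also sidesteps the fact that bash does not expand a wildcard written directly after `in1=`, because it treats the whole word as the pattern. The following is a sketch under assumptions: `resolve_one` is a hypothetical helper, paths are assumed to contain no spaces, and the bbsplit options are copied unchanged from the question. The loop only echoes the command as a dry run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the single file matching a glob pattern,
# or fail if there are zero or several matches.
resolve_one() (
    # $1 is left unquoted on purpose so the shell expands the glob;
    # this assumes the paths contain no spaces.
    set -- $1
    if [ "$#" -ne 1 ] || [ ! -e "$1" ]; then
        echo "expected exactly one match, got: $*" >&2
        exit 1
    fi
    printf '%s\n' "$1"
)

for i in {17..34}; do
    dir=./temp_expt/Sample_MBM1${i}
    # Skip the sample if either read file is missing or ambiguous.
    r1=$(resolve_one "$dir/MBM1${i}-*_R1_001.fastq") || continue
    r2=$(resolve_one "$dir/MBM1${i}-*_R2_001.fastq") || continue
    # Dry run: remove "echo" to actually call bbsplit.
    echo bash bbmap/bbsplit.sh in1="$r1" in2="$r2" \
        ref=./MG1655.fasta,./MGH78578.fasta,./GAPDH.fasta \
        basename=out_%.fq outu1=clean1.fq outu2=clean2.fq ambig2=toss
done
```

Samples with no match or with several matches are reported on stderr and skipped, so the multiple-tag case shows up as a message rather than as a wrong input file.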