Hello Everyone!
I'm trying to run a loop to trim my RNA-seq reads with bbduk. However, I can't seem to read the input files from one specific directory and write the output to another directory.
Probably this is very easy to solve. I'm currently using:
for i in `ls -1 /home/gabriel.gama/Dados_CD_genomics/TrueSeq_dezembro/*_1.fq.gz | sed 's/_1.fq.gz//'`
do
bbduk.sh -Xmx1g in1=$i\_1.fq.gz in2=$i\_2.fq.gz out1=/home/gabriel.gama/Análises/Teste1/bbduk/$i\_clean_1.fq.gz out2=/home/gabriel.gama/Análises/Teste1/bbduk/$i\_clean_2.fq.gz ref=/home/gabriel.gama/bbduk/bbmap/resources/adapters.fa ktrim=r k=23 mink=11 hdist=1 tpe tbo qtrim=r trimq=10 maq=10
done
I'm not getting the sample names in the output file names, even though the command runs. Instead, the output names contain the absolute path to the input files.
Maybe I should use something such as
for a in 'basename $i'
to get the basename of the file, and then reference it as such:
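A minimal sketch of that idea, assuming a POSIX shell: `basename` called with a suffix argument removes both the directory part and the `_1.fq.gz` ending, leaving only the sample name. The function name `sample_name` and the example path are hypothetical, used only for illustration.

```shell
# Hypothetical helper: reduce a path like /some/dir/S1_1.fq.gz to "S1".
sample_name() {
    # basename drops the directory part; the second argument drops the suffix.
    basename "$1" _1.fq.gz
}

# Example with a made-up path:
sample_name /data/reads/S1_1.fq.gz   # prints: S1
```

The result can then be placed after a fixed output directory, so the absolute input path never ends up inside `out1=` or `out2=`.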
Please do not use spaces in folder names. It is simpler to use a _ wherever you feel like using a space. For example, it would be better to rename "my machine" to "my_machine".
There can be no spaces in bbduk.sh options. You seem to have a space between ref= and the directory after it.
Try this:
for i in /home/gabriel.gama/Dados_CD_genomics/TrueSeq_dezembro/*_1.fq.gz; do
# Split each path into its directory and its sample name (suffix removed).
dname=$(dirname "${i}"); name=$(basename "${i}" _1.fq.gz)
bbduk.sh -Xmx1g in1="${dname}/${name}_1.fq.gz" in2="${dname}/${name}_2.fq.gz" \
out1="/home/gabriel.gama/Análises/Teste1/bbduk/${name}_clean_1.fq.gz" out2="/home/gabriel.gama/Análises/Teste1/bbduk/${name}_clean_2.fq.gz" \
ref=/home/gabriel.gama/bbduk/bbmap/resources/adapters.fa ktrim=r k=23 mink=11 hdist=1 tpe tbo qtrim=r trimq=10 maq=10
done
I removed spaces where possible, but I still can't refer to the sample name, only to the absolute path. Because of that I can't write the output fastq files correctly :(
Come on now, you can't expect us to change every single line of code. It helps if you put in some effort of your own, which here would be adding a fixed directory part to in1 and in2:
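One way to read that suggestion, as a hedged sketch and not necessarily the poster's exact code: keep the input and output directories in variables and build every file name from the bare sample name. The variable names `in_dir` and `out_dir`, the function `bbduk_cmd`, and the directories shown are placeholders. The function only prints the command, so it can be inspected before anything is run.

```shell
# Placeholder directories; substitute the real ones.
in_dir=/path/to/reads
out_dir=/path/to/bbduk_out

# Print (not run) the bbduk.sh call for one sample name, e.g. "S1".
bbduk_cmd() {
    printf '%s\n' "bbduk.sh -Xmx1g in1=${in_dir}/${1}_1.fq.gz in2=${in_dir}/${1}_2.fq.gz out1=${out_dir}/${1}_clean_1.fq.gz out2=${out_dir}/${1}_clean_2.fq.gz"
}

bbduk_cmd S1
```

The `ref=` and trimming options from the original command would be appended unchanged; they are left out here to keep the path handling visible.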
I think the code is a little complex: you list the read files (ls), strip the suffix to get a prefix (sed), and then add the read information back inside the loop. Please check the following code and see if you can improve on it. I have removed the paths and rephrased the code; see if it makes sense (please add paths wherever applicable). This code works when the reads and adapters are kept in the same directory:
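A minimal sketch of the simplification described above, assuming the loop runs inside the directory that holds the reads: a glob replaces `ls` and `sed`, and the parameter expansion `${f%_1.fq.gz}` strips the suffix. The function name `list_samples` is hypothetical, and it only prints sample names; the bbduk.sh call would take the place of the `printf` line.

```shell
# Hypothetical helper: print one sample name per *_1.fq.gz file in a directory.
list_samples() {
    (
        # Run in a subshell so the caller's working directory is unchanged.
        cd "$1" || exit 1
        for f in *_1.fq.gz; do
            # Skip the literal pattern when no file matches.
            [ -e "$f" ] || continue
            # Strip the suffix: S1_1.fq.gz -> S1
            printf '%s\n' "${f%_1.fq.gz}"
        done
    )
}

list_samples .
```

Because the names contain no directory part, they can be reused directly for both the input and the output file names.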
What error are you getting?
I can't get "samplename.fq" as the output name, just absolute paths.