Question: Parallel and Trim Galore
jjp55 wrote, 3 days ago:

Hi all.

I have paired-end sequencing reads and am trying to run Trim Galore on them with GNU parallel, but there is a bug somewhere: trim_galore is not being handed the file pairs the way I want.

My code is:

parallel --plus 'trim_galore --stringency 3 --paired {...}.fastq.gz {...}R2.fastq.gz' ::: *fastq.gz

My code is able to read the R1 file, the first of each pair, but it can't read the second. It looks for the R2 file under a name ending in R1R2.fastq.gz rather than R2.fastq.gz. Any help is greatly appreciated.

Tags: chip-seq next-gen sequence

ATpoint (Germany) wrote, 3 days ago:
I personally prefer to only give basenames to parallel and do the "name matching thing" outside of it. That makes it (for me) easier to manipulate the name string as I like.

For example:

# dummy data
touch foo.R1.fastq.gz
touch foo.R2.fastq.gz
touch bar.R1.fastq.gz
touch bar.R2.fastq.gz

$ ls *fastq.gz
bar.R1.fastq.gz bar.R2.fastq.gz foo.R1.fastq.gz foo.R2.fastq.gz

# now extract the basenames:
$ ls *R1.fastq.gz | awk -F ".R1.fastq.gz" '{print $1}'
bar
foo

# together with parallel (sort -u to ensure no duplicates)
ls *.R1.fastq.gz \
| awk -F ".R1.fastq.gz" '{print $1}' \
| sort -u \
| parallel "trim_galore --stringency 3 --paired {}.R1.fastq.gz {}.R2.fastq.gz"
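
The same "match the names outside of parallel" idea can also be sketched in plain bash, without awk. This is only an illustration under assumptions: the helper names `r2_for` and `print_trim_commands` are made up for this example, and the files are assumed to follow the `<sample>.R1.fastq.gz` / `<sample>.R2.fastq.gz` pattern of the dummy data above. The script only prints the trim_galore commands; it does not run them.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): derive the R2 mate's name
# from an R1 file name, assuming the ".R1.fastq.gz" naming used above.
r2_for() {
    local r1="$1"
    # Strip the ".R1.fastq.gz" suffix, then append the R2 suffix.
    printf '%s.R2.fastq.gz\n' "${r1%.R1.fastq.gz}"
}

# Print one trim_galore command per R1 file in the current directory.
# Nothing is executed here; the commands are only written to stdout.
print_trim_commands() {
    local r1 r2
    for r1 in *.R1.fastq.gz; do
        [ -e "$r1" ] || continue   # skip the literal pattern when nothing matches
        r2="$(r2_for "$r1")"
        if [ -e "$r2" ]; then
            printf 'trim_galore --stringency 3 --paired %s %s\n' "$r1" "$r2"
        else
            # Report unpaired files on stderr instead of emitting a broken command.
            printf 'warning: no mate found for %s\n' "$r1" >&2
        fi
    done
}

print_trim_commands
```

Once the printed commands look right, they can be piped into `parallel`, which runs each input line as a command when it is given no command of its own. File names containing spaces would need quoting that this sketch does not add.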