Question

usearch length trimming

6.3 years ago

n.elsahly • 0

I'm trying to use the -fastx_truncate command in usearch to create truncated versions of my reads. I have 10 samples, so I wrote the following script in nano, but it is not working:

#!/bin/bash

cd ../fq

usearch -fastx_truncate *.fastq -trunclen 200 -fastqout ../out/truncated.fq

There is something wrong with *.fastq, because when I change it to a single file name the command works. I want to combine all the truncated reads in one file, truncated.fq.

16s sequencing usearch uparse • 2.4k views

ADD COMMENT • link 6.3 years ago by n.elsahly • 0
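A small sketch of what is probably happening: the shell expands *.fastq into one argument per file before usearch ever starts, so the command receives ten input names where it seems to expect one (which would match the observation that a single file name works). The scratch directory, the file names s1.fastq to s3.fastq, and the count_args helper below are made up for illustration, and the usearch command is only echoed, not run.

```shell
#!/bin/bash
# Scratch directory with three empty, hypothetical sample files.
demo_dir=$(mktemp -d)
touch "$demo_dir/s1.fastq" "$demo_dir/s2.fastq" "$demo_dir/s3.fastq"

# count_args prints how many arguments it was given.
count_args() { echo "$#"; }

# The glob is expanded by the shell, so the command sees three arguments.
n_args=$(cd "$demo_dir" && count_args *.fastq)
echo "glob expanded to $n_args arguments"

# A loop hands over exactly one file name per call instead.
n_calls=0
for f in "$demo_dir"/*.fastq; do
    name=$(basename "$f" .fastq)
    n_calls=$((n_calls + 1))
    echo "would run: usearch -fastx_truncate $name.fastq -trunclen 200 -fastqout ../out/${name}_trunc.fq"
done

rm -rf "$demo_dir"
```

Writing one output file per sample also keeps each run from overwriting the previous one; the per-sample files can be concatenated afterwards.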

First, I do not know whether multiple FASTQ files are allowed as input.

If they are, what about trying `ls *.fastq` or `ls -d $PWD/*.fastq` instead of *.fastq?

ADD REPLY • link 6.3 years ago by Medhat 9.7k

I believe it is acceptable. With the -mergepairs command you can pass multiple FASTQ files, and they all end up in one merged output .fastq file.

ADD REPLY • link 6.3 years ago by n.elsahly • 0

I can do the trimming for each sample separately. But how can I pool the results to get one OTU-table as an output?

https://www.drive5.com/usearch/manual/pool_samples.html

I am not an expert in the field; I am still learning how to use the shell.

Thanks for your time.

ADD REPLY • link 6.3 years ago by n.elsahly • 0

Did you try `ls -d $PWD/*.fastq`?

ADD REPLY • link 6.3 years ago by Medhat 9.7k

yes, it gave: cannot open, no such file directory

ADD REPLY • link 6.3 years ago by n.elsahly • 0

For the following to work I am assuming that usearch appends the new results to the output file.

for i in `ls -1 *.fastq | sed 's/\.fastq$//'`; do usearch -fastx_truncate "$i.fastq" -trunclen 200 -fastqout ../out/truncated.fq; done

If that is not the case then do:

for i in `ls -1 *.fastq | sed 's/\.fastq$//'`; do usearch -fastx_truncate "$i.fastq" -trunclen 200 -fastqout "../out/${i}_trunc.fq"; done
cat ../out/*_trunc.fq > truncated.fq
rm -f ../out/*_trunc.fq

ADD REPLY • link 6.3 years ago by GenoMax 141k

both gave me unexpected EOF (end of file) while looking for matching

ADD REPLY • link 6.3 years ago by n.elsahly • 0

I doubt that it was while matching. This is a simple listing of all files that end in .fastq. Did you check if the truncated.fq file was produced?

ADD REPLY • link 6.3 years ago by GenoMax 141k

I did it another way: I trimmed each file separately, then I used cat to pool the fwd and rev reads for each sample.

Now I need to use the cat option again to pool them all together in one file, but I also need to add the sample name to each. -relabel @ option is only working with the -mergepairs command.

Do you have any idea how I can add the sample name to each read before concatenating everything into one file?

ADD REPLY • link 6.3 years ago by n.elsahly • 0

-relabel @ option is only working with the -mergepairs command.

Then you have to do it in that order.

BTW: If one (or more) of your original files was corrupt, then you will get the unexpected EOF message. You should check your fastq files (use validateFiles from Jim Kent's utilities) and then re-run the loop above.

ADD REPLY • link 6.3 years ago by GenoMax 141k
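If validateFiles is not at hand, a rough structural check can be sketched with awk. This is only an illustration, not a replacement for a real validator: it assumes plain 4-line FASTQ records (no wrapped sequences), and check_fastq is a made-up helper name. It tests that the line count is a non-zero multiple of 4, that header lines start with @, that separator lines start with +, and that sequence and quality strings have equal length.

```shell
#!/bin/bash
# check_fastq FILE: returns 0 if FILE looks like well-formed 4-line FASTQ.
check_fastq() {
    awk '
        NR % 4 == 1 && substr($0, 1, 1) != "@" { bad = 1; exit }   # header line
        NR % 4 == 2 { seqlen = length($0) }                        # sequence line
        NR % 4 == 3 && substr($0, 1, 1) != "+" { bad = 1; exit }   # separator line
        NR % 4 == 0 && length($0) != seqlen    { bad = 1; exit }   # quality line
        END { if (bad || NR == 0 || NR % 4 != 0) exit 1 }
    ' "$1"
}

# Report every .fastq file in the current directory that fails the check.
for f in *.fastq; do
    [ -e "$f" ] || continue   # skip the literal pattern if nothing matches
    check_fastq "$f" || echo "possible problem: $f"
done
```

A file that was cut off mid-record during download or copying should show up here, since either the line count or the sequence/quality length comparison fails.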

But my reads are 700 bp MiSeq 2X300, so I can not merge them. That's why I am trying to relabel and pool with cat.

Please correct me if I am wrong.

ADD REPLY • link 6.3 years ago by n.elsahly • 0

but my reads are 700 bp MiSeq 2X300, so I can not merge them

Not sure what that means. Did you size-select fragments to be ~700 bp and then sequence the resulting library 2 x 300? Do you expect the fragments to overlap/merge? What kind of experiment is this?

ADD REPLY • link 6.3 years ago by GenoMax 141k

I know they will never merge; my question is completely different.

I have trimmed sequences and I need to pool them in one file. That could be done with the cat command, but cat will not relabel the reads, so I will end up with one large file of all reads as if it were a single sample.

How can I relabel the reads before pooling them in one file?

I mentioned -relabel in -mergepairs only as an example, since I need a similar output format.

ADD REPLY • link 6.3 years ago by n.elsahly • 0
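One way to relabel outside of usearch is to rewrite the header lines with awk before concatenating. The sketch below is an assumption-laden illustration rather than an official recipe: it assumes 4-line FASTQ records, one trimmed file per sample named <sample>_trunc.fq, and that downstream steps read the sample name from the part of the label before the first period (the sample.N style that -relabel @ produces). relabel_fastq and pool_relabeled are made-up helper names. Check the usearch manual for the label format your pipeline expects, and avoid periods in the sample names themselves.

```shell
#!/bin/bash
# relabel_fastq SAMPLE FILE: print FILE with each header rewritten as @SAMPLE.N
relabel_fastq() {
    awk -v sample="$1" '
        NR % 4 == 1 { n++; print "@" sample "." n; next }   # header line of each record
        { print }                                           # sequence, "+", quality unchanged
    ' "$2"
}

# pool_relabeled DIR: relabel every DIR/<sample>_trunc.fq and print them all to stdout.
pool_relabeled() {
    local dir=$1 f
    for f in "$dir"/*_trunc.fq; do
        [ -e "$f" ] || continue   # skip the literal pattern if nothing matches
        relabel_fastq "$(basename "$f" _trunc.fq)" "$f"
    done
}
```

With the trimmed files in ../out, something like `pool_relabeled ../out > ../out/pooled.fq` would then give one file in which every read carries its sample name. The original read identifiers are discarded, so keep the per-sample files if you need to trace reads back.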