Question: Running mutltiple Fastq files with Hisat2 in a loop
0
gravatar for n.tear
11 months ago by
n.tear10
n.tear10 wrote:

I am trying to loop paired end reads in hisat2 with an output of sorted bam using a pipe.

I am a novice at coding but am keen to improve my skills by learning how to loop.

I have created this loop using the answers from Run mutltiple Fastq files on Hisat2

 for f in `ls *.fastq.gz | sed 's/_[12].fastq.gz//g' | sort -u`
do
echo hisat2 -x /mnt/z/NathanHaffordTear/RNAseqproject/RNAseqanalysis/hg38_index -1 ${f}_1.fastq.gz -2 ${f}_2.fastq.gz | samtools sort -o ${f}.hisat.sorted.bam
done

For some reason the loop does work without the pipe but not with. How can i configure this loop to work with the piping as I do not want to have the intermediate files?

Many thanks in advance

rna-seq • 483 views
ADD COMMENTlink modified 11 months ago by ATpoint34k • written 11 months ago by n.tear10

Is the echo supposed to precede the call to hisat2?

ADD REPLYlink written 11 months ago by russhh5.3k

I added it in to check the code before running.

ADD REPLYlink written 11 months ago by n.tear10
$ ls *.gz
test_1.fastq.gz  test_2.fastq.gz

--

for f in `ls *.fastq.gz | sed 's/_[12].fastq.gz//g' | sort -u`
do
echo "hisat2 -x /mnt/z/NathanHaffordTear/RNAseqproject/RNAseqanalysis/hg38_index -1 ${f}_1.fastq.gz -2 ${f}_2.fastq.gz | samtools sort -o ${f}.hisat.sorted.bam"
done

output:

hisat2 -x /mnt/z/NathanHaffordTear/RNAseqproject/RNAseqanalysis/hg38_index -1 test_1.fastq.gz -2 test_2.fastq.gz | samtools sort -o test.hisat.sorted.bam

OP is supposed to run:

for f in `ls *.fastq.gz | sed 's/_[12].fastq.gz//g' | sort -u`
do
echo "parising f...wait..." && hisat2 -x /mnt/z/NathanHaffordTear/RNAseqproject/RNAseqanalysis/hg38_index -1 ${f}_1.fastq.gz -2 ${f}_2.fastq.gz | samtools sort -o ${f}.hisat.sorted.bam 
done

OP can also try parallel:

parallel --plus --dry-run  "hisat2 -x /mnt/z/NathanHaffordTear/RNAseqproject/RNAseqanalysis/hg38_index -1 {} -2 {=s/1/2/=} | samtools sort -o {=s/_1.fastq.gz//=}.hisat.sorted.bam" ::: *_1.fastq.gz

output:

hisat2 -x /mnt/z/NathanHaffordTear/RNAseqproject/RNAseqanalysis/hg38_index -1 test_1.fastq.gz -2 test_2.fastq.gz | samtools sort -o test.hisat.sorted.bam
ADD REPLYlink modified 11 months ago • written 11 months ago by cpad011213k
2
gravatar for ATpoint
11 months ago by
ATpoint34k
Germany
ATpoint34k wrote:

Remove the echo as you want to execute rather than print the command. If you want "status updates" do

echo $f && hisat2 (...)
ADD COMMENTlink modified 11 months ago • written 11 months ago by ATpoint34k

Hi thanks for the reply

Im aware of this but want to make sure the code works with echo before running

running my code I get the error:

[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting
[W::sam_read1] Parse error at line 1
samtools sort: truncated file. Aborting

I will try with the added echo $f && hisat2 (...) and let you know how it goes

thanks

ADD REPLYlink modified 11 months ago • written 11 months ago by n.tear10

Of course because hisat2 does not produce output and therefore samtools does not get any valid stdin. Remove the echo and maybe put a head like hisat2 | head -n 1000 | samtools sort to get an idea if it works properly (and it should, the command is ok).

ADD REPLYlink written 11 months ago by ATpoint34k

Thanks that makes perfect sense. doh

The command you suggested before seems to be working great, i like getting the updates

ADD REPLYlink written 11 months ago by n.tear10
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1246 users visited in the last hour