Question: Alignment with bowtie2 but empty .fastq files
0
gravatar for luzglongoria
5 months ago by
luzglongoria20
luzglongoria20 wrote:

Hi,

I am trying to align my .fq reads (end-pair reads) to my reference genome (a "close" related species) with this command:

bowtie2 --threads 4 --local --no-unal \
    -x /home/luz_garcia_longoria/workspace/reference_genomes/parasitereference.fasta \
    -q -k 1 --al aligned_reads.fastq \
    -1 `/home/luz_garcia_longoria/workspace/s21_1.fq,s22_1.fq,s23_1.fq,s24_1.fq,s25_1.fq,s31_1.fq,s32_1.fq,s33_1.fq,s34_1.fq,s35_1.fq \
    -2 /home/luz_garcia_longoria/workspace/s21_2.fq,s22_2.fq,s23_2.fq,s24_2.fq,s25_2.fq,s31_2.fq,s32_2.fq,s33_2.fq,s34_2.fq,s35_2.fq \
    > aligned_host_parasite.sam |\
    samtools view -b -o aligned_host_parasite.bam`

BUT if I do ll :

-rw-rw-r--. 1 luz_garcia_longoria luz_garcia_longoria          62 Sep 24 17:42 aligned_host_parasite.bam
-rw-rw-r--. 1 luz_garcia_longoria luz_garcia_longoria 30494171957 Sep 25 00:07 aligned_host_parasite.sam
-rw-rw-r--. 1 luz_garcia_longoria luz_garcia_longoria           0 Sep 24 17:42 aligned_reads.fastq

As you can see the .fastq file is completly empty and it's the one that I want to use for for doing my next step.

What am I missing?

rna-seq assembly • 379 views
ADD COMMENTlink modified 5 months ago by ATpoint14k • written 5 months ago by luzglongoria20

Your fastq is empty, so it seems you somehow corrupted your data after the alignment was performed, since that file seems pretty big. Can you regenerate the fastq file?

You can also convert the bam back to fastq using samtools fastq myaln.bam > myreads.fastq

ADD REPLYlink modified 5 months ago • written 5 months ago by WouterDeCoster37k

Ok. I though also to do this but I wonder if I am going to alter somehow the BAM files and then the fastq file will be the "real" one...

ADD REPLYlink written 5 months ago by luzglongoria20

Is your command pasted here properly? The ` begins after the -1 but does not end at the right place.

ADD REPLYlink modified 5 months ago • written 5 months ago by RamRS20k

-1 `/home/luz_garcia_longoria/workspace/s21_1.fq,s22_1.fq,s23_1.fq,s24_1.fq,s25_1.fq,s31_1.fq,s32_1.fq,s33_1.fq,s34_1.fq,s35_1.fq

You have to re-write the path for all your fq or launch your script from your workspace

Same for option -2

ADD REPLYlink modified 5 months ago • written 5 months ago by Bastien Hervé3.7k

What if OP used shell expansion?

-1 /home/luz_garcia_longoria/workspace/{s21_1.fq,s22_1.fq,s23_1.fq,s24_1.fq,s25_1.fq,s31_1.fq,s32_1.fq,s33_1.fq,s34_1.fq,s35_1.fq}
ADD REPLYlink written 5 months ago by RamRS20k

Ofc it is an elegant solution but brackets are missing from OP command line

ADD REPLYlink written 5 months ago by Bastien Hervé3.7k

the command is

bowtie2 --threads 4 --local --no-unal -x /home/luz_garcia_longoria/workspace/reference_genomes/parasitereference.fasta \
-q -k 1 --al aligned_reads.fastq -1 
/home/luz_garcia_longoria/workspace/s21_1.fq,s22_1.fq,s23_1.fq,s24_1.fq,s25_1.fq,s31_1.fq,s32_1.fq,s33_1.fq,s34_1.fq,s35_1.fq
-2
/home/luz_garcia_longoria/workspace/s21_2.fq,s22_2.fq,s23_2.fq,s24_2.fq,s25_2.fq,s31_2.fq,s32_2.fq,s33_2.fq,s34_2.fq,s35_2.fq 
> aligned_host_parasite.sam \
 | samtools view -b -o aligned_host_parasite.bam
ADD REPLYlink modified 5 months ago • written 5 months ago by luzglongoria20

And I am running this command in workspace :)

ADD REPLYlink written 5 months ago by luzglongoria20

If you're running your command from /home/luz_garcia_longoria/workspace/ you don't need it in option -1 and -2

ADD REPLYlink written 5 months ago by Bastien Hervé3.7k
3
gravatar for ATpoint
5 months ago by
ATpoint14k
Germany
ATpoint14k wrote:

Ok, there is apparently some confusion here:

1) You are right now saving your alignment to > aligned_host_parasite.sam. The SAM format is the format that alignments are by default stored in. Using > redirects the stream from bowtie2 to the specified file (aligned_host_parasite.sam). If you do that, you (probably) cannot pipe it into samtools view, therefore the BAM is empty.

2) aligned_reads.fastq being empty is not surprising. From the bowtie2 manual:

--al <path>        write unpaired reads that aligned at least once to <path>

That means it will contain only files that are unpaired but aligned in any fashion. It being empty means you have no unpaired reads that aligned, probably because all reads were paired, which is expected from paired-end fastq files.

The correct command would be (removing the quotes around -1 and -2, which I think are not necessary):

$WORK_DIR="/home/luz_garcia_longoria/workspace"
bowtie2 --threads 4 --local --no-unal \
        -x ${WORK_DIR}/reference_genomes/parasitereference.fasta \
        -q -k 1 --al aligned_reads.fastq \
        -1 ${WORK_DIR}/s21_1.fq,${WORK_DIR}/s22_1.fq,\
           ${WORK_DIR}/s23_1.fq,${WORK_DIR}/s24_1.fq,\
           ${WORK_DIR}/s25_1.fq,${WORK_DIR}/s31_1.fq,\
           ${WORK_DIR}/s32_1.fq,${WORK_DIR}/s33_1.fq,\
           ${WORK_DIR}/s34_1.fq,${WORK_DIR}/s35_1.fq \
        -2 ${WORK_DIR}/s21_2.fq,${WORK_DIR}/s22_2.fq,\
           ${WORK_DIR}/s23_2.fq,${WORK_DIR}/s24_2.fq,\
           ${WORK_DIR}/s25_2.fq,${WORK_DIR}/s31_2.fq,\
           ${WORK_DIR}/s32_2.fq,${WORK_DIR}/s33_2.fq,\
           ${WORK_DIR}/s34_2.fq,${WORK_DIR}/s35_2.fq | \
        samtools view -b -o aligned_host_parasite.bam

By the way, SAM/BAM are the files you use for downstream analysis. Fastq is only a format for unaligned data. I hope you do not want to use fastq files for any downstream. What is your next step?

ADD COMMENTlink modified 5 months ago by RamRS20k • written 5 months ago by ATpoint14k

thank you for your useful comment!

My next step is Trinity and I think I need fastq files, isn't it?

As WouterDeCoster said I can also convert my sam files into fastq files but the thing is that I will have two SAM files (one that map with the parasite and other one that won't map either with the parasite or the host). So in order to run Trinity I want these two files together (to merge them) and then run Trinity.

I hope is clear

ADD REPLYlink written 5 months ago by luzglongoria20

Regardless of whether you need fastq for trinity, the bowtie command you posted above is wrong.

You cannot redirect to a file and keep on piping to other commands. Try the following stupid example

echo "foobar" > myfile.txt | echo

"foobar" will be in myfile.txt but the final echo doesn't do anything because after > nothing is send to |.

ADD REPLYlink written 5 months ago by WouterDeCoster37k

| echo doesn't do much, | cat is a better test.

Also, there are exceptions.

Exception

ADD REPLYlink modified 5 months ago • written 5 months ago by RamRS20k

Using > redirects the stream from bowtie2 to the specified file (aligned_host_parasite.sam). If you do that, you cannot pipe it into samtools view, therefore the BAM is empty.

While this is true in most cases and makes logical sense, there is a particular zsh feature called MULTIOS that makes the shell behave unexpectedly when it encounters a redirect followed by a pipe. It essentially becomes a tee at that point.

ADD REPLYlink written 5 months ago by RamRS20k

Good catch, I added a "probably" to my sentence ;-)

ADD REPLYlink written 5 months ago by ATpoint14k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2178 users visited in the last hour