I can separate unaligned reads through a rather convoluted process described here (https://github.com/alvaralmstedt/Tutorials/wiki/Separating-mapped-and-unmapped-reads-from-libraries), but is there an easier way to separate unaligned reads in FASTQ format using bowtie? I would appreciate it if someone could guide me through the process for both single-end and paired-end reads in FASTQ format.
Did you look at the relevant section of the bowtie2 manual?
If you used bbmap.sh to align the reads, then outu=file.fq will collect the unaligned reads.
If you have aligned BAM files containing unaligned reads, you can easily separate them with:
samtools view -f 4 file.bam > unmapped.sam
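The command above writes the unmapped records as SAM, while the question asks for FASTQ. As a minimal sketch of what the -f 4 filter selects and how those records map onto FASTQ, the function below reads SAM text on stdin and prints a FASTQ record for every read whose FLAG has bit 0x4 (read unmapped) set. The function name sam_unmapped_to_fastq is hypothetical, and the sketch assumes single-end data with uncompressed SAM input; it does not add /1 or /2 suffixes for mates.

```shell
# Hypothetical helper: convert unmapped SAM records on stdin to FASTQ on stdout.
# Assumes tab-separated SAM text; columns 1, 2, 10, 11 are QNAME, FLAG, SEQ, QUAL.
sam_unmapped_to_fastq() {
  awk -F '\t' '
    /^@/ { next }                      # skip SAM header lines
    int($2 / 4) % 2 == 1 {             # FLAG bit 0x4 set: read is unmapped
      printf "@%s\n%s\n+\n%s\n", $1, $10, $11
    }'
}
```

It could be used as samtools view file.bam | sam_unmapped_to_fastq > unmapped.fastq. samtools also has a fastq subcommand that accepts the same -f filter, which is probably the more robust route for real data.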
I think the simplest solution is SAMBLASTER. It is actually a tool for marking duplicates and extracting split/discordant reads for structural variant analysis, but it also has an option to output unmapped reads as FASTQ. To make the tool output only the unmapped reads, without otherwise manipulating the bowtie output, I would do:
bowtie --sam (...) | samblaster -a --ignoreUnmated -u reads_unmapped.fastq --quiet | samtools view -b -o alignment.bam
-a turns off duplicate detection and --ignoreUnmated turns off the detection of unmated reads. alignment.bam is then your bowtie output in BAM format. You can also pipe the whole thing directly into samtools sort to save disk space and time.
I tried saving the output as FASTQ directly with bowtie, and that does the job. Note that if you save the output as FASTQ, it loses the SAM features and saves only the aligned reads.
Here is what I did:
bowtie -q -p 18 -v 1 index_out infile.fastq --un unaligned_output.fastq --al aligned_output.fastq
This gives you both aligned and unaligned reads. index_out is the basename of the bowtie index produced by bowtie-build, without the file extension.
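A quick sanity check after splitting with --un and --al is to confirm that the aligned and unaligned read counts add up to the input. The sketch below is only an illustration: the function names count_fastq_reads and check_split are hypothetical, and it assumes uncompressed FASTQ files with exactly four lines per record.

```shell
# Hypothetical helper: count records in an uncompressed FASTQ file,
# assuming exactly 4 lines per record.
count_fastq_reads() {
  awk 'END { print int(NR / 4) }' "$1"
}

# Hypothetical helper: succeed only if aligned + unaligned equals the input count.
# Arguments: input FASTQ, aligned FASTQ, unaligned FASTQ.
check_split() {
  total=$(count_fastq_reads "$1")
  aligned=$(count_fastq_reads "$2")
  unaligned=$(count_fastq_reads "$3")
  echo "input=$total aligned=$aligned unaligned=$unaligned"
  [ "$total" -eq $((aligned + unaligned)) ]
}
```

With the file names from the command above, this would be called as check_split infile.fastq aligned_output.fastq unaligned_output.fastq. For paired-end input given with -1 and -2, the bowtie manual describes --un and --al as writing the two mates to parallel files with _1 and _2 inserted into the file name, so each mate file can be checked the same way.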