I have paired-end Illumina HiSeq reads, 126 bp each (after adapter trimming), from metagenomic samples. The reads have already been filtered against the human genome (hg19). I want to filter these reads against all known bacterial, fungal, and viral genomes. I am doing this step by step, each time writing out the unaligned reads (using the --un-conc flag), and I run Bowtie2 with --very-sensitive. At the end of the process I still get reads that can be mapped to the viral/bacterial databases. The filtration seems inconsistent: only a portion of the reads is filtered out, apparently depending on their abundance in the library.
Does anyone have an idea why this is happening and how to solve it?
Having read some threads here, I thought that maybe this happens because one mate of a pair aligns to a database and the other does not, so the whole pair is passed on as unaligned.
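
To make that suspicion concrete, here is a minimal Python sketch of how it could be checked from the SAM output of one filtering step. It is only an illustration under assumptions: the function names (`classify_pair`, `strictly_unaligned_names`) and the example records are hypothetical, and it relies on my understanding that --un-conc writes every pair that fails to align concordantly, which would include pairs where only one mate aligned. The FLAG bits themselves come from the SAM specification.

```python
# SAM FLAG bits, as defined in the SAM format specification.
PROPER_PAIR = 0x2      # pair aligned properly (concordantly) according to the aligner
READ_UNMAPPED = 0x4    # this read has no alignment
MATE_UNMAPPED = 0x8    # its mate has no alignment
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def classify_pair(flag: int) -> str:
    """Classify one paired-end SAM record by its FLAG value."""
    read_unmapped = bool(flag & READ_UNMAPPED)
    mate_unmapped = bool(flag & MATE_UNMAPPED)
    if read_unmapped and mate_unmapped:
        return "both_unmapped"
    if read_unmapped or mate_unmapped:
        return "one_mate_mapped"
    if flag & PROPER_PAIR:
        return "concordant"
    return "discordant"


def strictly_unaligned_names(sam_lines):
    """Return the names of pairs in which neither mate aligned."""
    names = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # only primary records describe the pair itself
        if classify_pair(flag) == "both_unmapped":
            names.add(fields[0])
    return names


# Hypothetical, hand-written records; only QNAME and FLAG matter here.
example = [
    "@HD\tVN:1.0",
    "pairA\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",      # neither mate aligned
    "pairA\t141\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "pairB\t73\tref1\t10\t42\t4M\t=\t10\t0\tACGT\tIIII",  # only this mate aligned
    "pairB\t133\tref1\t10\t0\t*\t=\t10\t0\tACGT\tIIII",
    "pairC\t99\tref1\t10\t42\t4M\t=\t50\t44\tACGT\tIIII",  # concordant pair
    "pairC\t147\tref1\t50\t42\t4M\t=\t10\t-44\tACGT\tIIII",
]

clean = strictly_unaligned_names(example)
print(sorted(clean))
```

If my assumption about --un-conc is right, a pair like `pairB` would be carried into the next step even though one of its mates matches the database, while a stricter filter keeps only pairs like `pairA`. As far as I know, the same selection can be made with `samtools view -f 12 -F 256` on the alignment output, but I have not confirmed that this fixes my case.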
I would be happy to hear any thoughts and solutions about this.