I am a newbie in bioinformatics and I am having some difficulty extracting a subset of reads from a .bam file. I have a set of reads (Illumina HiSeq 4000, most of them 125 bp long) that I generated from laser-captured chromosomes, and I am performing a read-filtering step to remove reads that could be contaminants from bacterial genomes. My first idea was to use samtools to extract the unmapped reads, like this:
samtools view -b -f 4 mapping.bam > unmapped_reads.bam
However, in a preliminary read mapping against a few bacterial genomes, some reads have alignments of only <= 30 bp. Is there a way to extract both together: (a) unmapped reads + (b) reads with alignment size <= 30 bp?
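To make (b) concrete, here is a rough sketch of the kind of filter I have in mind, written in Python over SAM text. It is only an illustration under my own assumptions: I take "alignment size" to mean the sum of the M, = and X operations in the CIGAR string, the threshold of 30 comes from my question, and the function names (aligned_length, keep_record, filter_sam) are made up by me. I am not sure this is the right or the most efficient way to do it.

```python
import re

# CIGAR operations that align read bases to the reference (assumed definition
# of "alignment size"; soft clips, insertions and deletions are not counted).
ALIGNED_OPS = {"M", "=", "X"}
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_length(cigar):
    """Sum the lengths of M/=/X operations; '*' (no alignment) gives 0."""
    if cigar == "*":
        return 0
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in ALIGNED_OPS)


def keep_record(line, max_aligned=30):
    """Return True for header lines, unmapped reads, and short alignments."""
    if line.startswith("@"):
        return True  # keep the SAM header so the output can be converted to BAM
    fields = line.rstrip("\n").split("\t")
    flag, cigar = int(fields[1]), fields[5]
    if flag & 4:  # 0x4 = this read is unmapped
        return True
    return aligned_length(cigar) <= max_aligned


def filter_sam(lines, max_aligned=30):
    """Yield only the SAM lines that pass keep_record."""
    for line in lines:
        if keep_record(line, max_aligned):
            yield line


# Intended use (hypothetical script name filter_short.py): add
#   import sys; sys.stdout.writelines(filter_sam(sys.stdin))
# at the end of the file, then run
#   samtools view -h mapping.bam | python filter_short.py | samtools view -b - > subset.bam
```

Note that this looks at each record on its own, so for paired-end data a read could be kept while its mate is dropped.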
Thanks in advance, guys, and sorry if this is too basic a question, but I searched the forum and did not find an answer.
Best regards, Kaleb Gatto