I'm working on the assembly of a chloroplast genome. I have a BAM file and I want to:
- Remove pairs that did not map to the reference genome
- Keep reads that are mapped as proper pairs
- Keep reads in a pair where one read mapped but the other did not (I want to include the unmapped read if its pair mapped properly)
I have been reading about the SAM flags, and my teacher and I came up with the following steps:
samtools fastq -f 3 -1 paired_1.fastq -2 paired_2.fastq zm_cp.bam
samtools fastq -f 9 -1 unmapped_1.fastq -2 unmapped_2.fastq zm_cp.bam
samtools fastq -f 5 -1 unmapped_t1.fastq -2 unmapped_t2.fastq zm_cp.bam
cat unmapped_1.fastq unmapped_t1.fastq > tmp1.fastq
cat unmapped_t2.fastq unmapped_2.fastq > tmp2.fastq
cat paired_1.fastq tmp1.fastq > chloroplast_reads_1.fastq
cat paired_2.fastq tmp2.fastq > chloroplast_reads_2.fastq
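To check my reading of the flags, here is a small Bash sketch of which of the three -f filters a given FLAG value would pass. It assumes that -f INT keeps a read only when every bit of INT is set in its FLAG. The matches function is a hypothetical helper of my own, not part of samtools, and the example FLAG values are ones I picked by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of samtools): prints which of the -f values
# used in my commands (3, 9, 5) a given SAM FLAG would pass.
# Assumption: "-f INT" keeps a read only if ALL bits of INT are set in FLAG.
matches() {
    local flag=$1 out="" required
    for required in 3 9 5; do
        # (flag & required) == required means every required bit is set
        if (( (flag & required) == required )); then
            out+="${out:+ }-f$required"
        fi
    done
    printf '%s\n' "${out:-none}"
}

matches 99    # read 1 of a proper pair (1+2+32+64)        -> -f3
matches 73    # mapped read 1, mate unmapped (1+8+64)      -> -f9
matches 133   # unmapped read 2, mate mapped (1+4+128)     -> -f5
matches 77    # read 1, both mates unmapped (1+4+8+64)     -> -f9 -f5
```

If this is right, reads from pairs where both mates are unmapped (for example FLAG 77) pass both -f 9 and -f 5, so they would seem to end up in my output twice, even though I wanted to remove those pairs.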
I would like to know whether this workflow is appropriate. I'd be very happy to get some advice :)
Thanks a lot in advance!