I ran bwa mem on my paired-end files against a mixed reference genome (hg19 and mm10 concatenated) with the following command:
bwa mem -t 4 mixed_human_mouse.fa \
rep1.R1.decomplex.fastq.gz \
rep1.R2.decomplex.fastq.gz \
| samtools view -bS - > p56.rep1.bam
However, several reads map to different species (both human and mouse) or to different chromosomes. How can I keep only the unique reads (a single chromosome and a single species per read)? Part of the SAM output is shown below (Human5 means chromosome 5 in hg19):
CGGCTATGCGTACTATTCTCTCCGCCTATCCT:M01581:1209:000000000-D3YJT:1:1102:15099:1447 129 Mouse10 72297312 15 44M MouseM 2300 0 TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTCTCTTTTTTTT 111>11110>0>///////>//>///<///<</0111<111/-- NM:i:0 MD:Z:44 AS:i:44 XS:i:40
ATTACTCGTAGGCATGCGTCTAATTTAGAGGC:M01581:1209:000000000-D3YJT:1:1102:15159:1448 81 Human4 31583390 60 44M Mouse5 81894906 0 ATCCCATAAAAATATTTATCATTAGATTTAGCACATACCTGTAG GA4FHHGHHHHHGGG5GFFBGFGFGFFGFGFCFFFFFFC3A3>3 NM:i:1 MD:Z:43T0 AS:i:43 XS:i:21
Simon
Thanks a lot! I'll look into it. How about the reads with different chromosomes?
You will have to decide what to do with the multi-mappers. BBMap gives you multiple options (look at ambig=).
Thanks for your help! I finished it successfully. However, I don't know how to access the mapping statistics and check the quality. Do you have any suggestions?
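On the earlier question about reads whose mate lands on a different chromosome or species: one possible filter, sketched here with awk, keeps only records whose mate is on the same reference sequence and whose mapping quality passes a cutoff. In SAM, column 7 (RNEXT) is "=" when the mate maps to the same reference as the read, and column 5 is MAPQ. The function name filter_same_ref and the default cutoff of 30 are assumptions for illustration, not settings from this thread.

```shell
# Keep SAM header lines plus records whose mate is on the same reference
# (RNEXT, column 7, is "=") and whose MAPQ (column 5) is at least min_mapq.
# filter_same_ref is a hypothetical helper name; 30 is an arbitrary default.
filter_same_ref() {
    min_mapq="${1:-30}"
    awk -v q="$min_mapq" 'BEGIN { FS = OFS = "\t" }
        /^@/ { print; next }                        # pass header lines through
        $7 == "=" && ($5 + 0) >= (q + 0) { print }  # same reference, MAPQ ok
    '
}

# Possible usage (the output file name is an assumption):
#   samtools view -h p56.rep1.bam | filter_same_ref 30 \
#       | samtools view -b - > p56.rep1.same_ref.bam
```

Both example records in the question would be dropped by this filter, since their mates are on MouseM and Mouse5 rather than "=". Note that the MAPQ test is applied per record, so one mate of a pair can survive while the other is removed; whether that is acceptable depends on the downstream analysis.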
BBTools produces stats on STDERR by default. Did you capture that output? If not, you should be able to use reformat.sh on your BAM file to produce stats (histogram options). Otherwise, Qualimap or samtools idxstats would work as well.
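For the samtools idxstats route, a small sketch of how the per-chromosome counts could be collapsed into per-species totals. idxstats prints one tab-separated line per reference: name, length, mapped read count, unmapped read count. The helper name species_counts is hypothetical, and the "Human"/"Mouse" prefixes are assumed from the chromosome names shown in the question.

```shell
# Sum the mapped-read column (column 3) of `samtools idxstats` output per
# species, assuming reference names start with "Human" or "Mouse".
# species_counts is a hypothetical helper name.
species_counts() {
    awk 'BEGIN { FS = "\t" }
        $1 ~ /^Human/ { human += $3 }
        $1 ~ /^Mouse/ { mouse += $3 }
        END { printf "Human\t%d\nMouse\t%d\n", human, mouse }'
}

# Possible usage (the sorted file name is an assumption):
#   samtools sort -o p56.rep1.sorted.bam p56.rep1.bam
#   samtools index p56.rep1.sorted.bam
#   samtools idxstats p56.rep1.sorted.bam | species_counts
```

samtools flagstat on the same BAM gives overall totals (mapped, properly paired, mate mapped to a different chromosome), which complements the per-species split.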