Question: How to know which reads were not aligned to a reference genome?
22 months ago by
genebow150
USA/Chicago
genebow150 wrote:

Paired-end FASTQ reads from a bacterial strain can be aligned to a reference genome with BWA. Using bedtools, the coverage of the reads on the reference genome can then be computed with 'coverageBed'. As I understand it, coverage means how many reads were aligned to each position of the reference genome. I have a new question about the coverage analysis: how do we know which reads were not aligned to the reference genome? Because the sequenced genome may be larger than the reference, or may have rearrangements or duplicated regions, some reads may not find corresponding regions in the reference genome. Are there any tools that can find unaligned reads? Thanks!

snp sequence alignmet • 670 views
22 months ago by
bk1130
bk1130 wrote:

You can use samtools to find unaligned reads.

https://davetang.org/wiki/tiki-index.php?page=SAMTools

samtools view -b -f 4 file.bam > unmapped.bam

samtools view -b -F 4 file.bam > mapped.bam
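To make clearer what the two commands above select: column 2 of every SAM record is a bitwise FLAG, and bit 4 means "read unmapped". `-f 4` keeps records with that bit set, and `-F 4` keeps records without it. The sketch below only illustrates that test on SAM text using awk; the function name `unmapped_read_names` is made up for this example and is not part of samtools.

```shell
# Print the names of unmapped reads from SAM text on stdin.
# Column 2 of a SAM record is the bitwise FLAG; bit 4 means "read unmapped",
# which is the bit that `samtools view -f 4` requires to be set.
unmapped_read_names() {
    awk -F '\t' '
        /^@/ { next }                      # skip SAM header lines
        int($2 / 4) % 2 == 1 { print $1 }  # bit 4 is set: print the read name
    '
}

# Hypothetical usage, assuming samtools is installed and file.bam exists:
#   samtools view -h file.bam | unmapped_read_names | sort -u > unmapped_names.txt
```

For paired-end data, a read can be unmapped while its mate is mapped. If you only want pairs where both reads failed to align, `-f 12` (4 for "read unmapped" plus 8 for "mate unmapped") should do that.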


Great, that is it! Thank you!

written 22 months ago by genebow150