Extracting percent identity of mapped reads and the coordinates on the genome the read mapped from a BAM file?
1
0
Entering edit mode
3.0 years ago
Eric • 0

Hello,

I am working with bam files that have environmental reads mapped back to an isolate genome. I am looking for a way to extract both the percent identity of all mapped reads in the BAM file and the coordinates on the genome where they mapped. I was looking at some combination of maybe bedtools and pysam or parsing the CIGAR string but nothing I have tried has worked so far.

I was trying to find a pythonic answer, but if people have other solutions feel free to share.

Thanks!

Bedtools PySam python BAM Samtools • 1.3k views
ADD COMMENT
1
Entering edit mode
3.0 years ago

If you use pysam:

samfile = pysam.AlignmentFile("your_bam", "rb")
for read in samfile.fetch():
     cig=read.cigarstring #for cigarstring
     aligned=read.get_aligned_pairs() # will give you the aligned pairs

Be careful 'M' in cigar also include SNPs

ADD COMMENT
0
Entering edit mode

Thanks for this example, its very helpful being new to pysam!

ADD REPLY

Login before adding your answer.

Traffic: 2649 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6