2.4 years ago
Etienne
Hi, I aligned the SRR3990725 reads against the MT895507 genome with Bowtie and got the SAM/BAM files and these stats:
2060454 reads; of these:
2060454 (100.00%) were paired; of these:
2060452 (100.00%) aligned concordantly 0 times
1 (0.00%) aligned concordantly exactly 1 time
1 (0.00%) aligned concordantly >1 times
----
2060452 pairs aligned concordantly 0 times; of these:
0 (0.00%) aligned discordantly 1 time
----
2060452 pairs aligned 0 times concordantly or discordantly; of these:
4120904 mates make up the pairs; of these:
4115130 (99.86%) aligned 0 times
1877 (0.05%) aligned exactly 1 time
3897 (0.09%) aligned >1 times
0.14% overall alignment rate
What I want to know is the percent identity of the 0.14% of reads that did align. How can I find this out?
You may be able to do this from the SAM file, as described here: https://zombieprocess.wordpress.com/2013/05/21/calculating-percent-identity-from-sam-files/ The approach relies on parsing the MD tag of each alignment (if your records lack it, you can add it with samtools, e.g. samtools calmd). You will also have to write a small parser to decode that tag.
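For what it's worth, here is a minimal sketch of such a parser in Python. It is not the code from the linked post. It assumes one particular definition of identity (matching bases divided by all alignment columns, counting mismatches, insertions and deletions), and the function names `percent_identity` and `identity_from_sam_line` are made up for illustration. Soft-clipped bases are ignored, so other definitions will give different numbers.

```python
import re

# CIGAR operations: a length followed by one operation letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# MD tag tokens: a run of matches, a deletion (^ plus bases), or one mismatch.
MD_RE = re.compile(r"(\d+)|(\^[A-Za-z]+)|([A-Za-z])")


def percent_identity(cigar, md):
    """Percent identity over alignment columns (an assumed definition).

    identity = matches / (matches + mismatches + inserted + deleted)
    """
    # Insertions are only visible in the CIGAR string, not in the MD tag.
    inserted = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op == "I")
    matches = mismatches = deleted = 0
    for run, deletion, mismatch in MD_RE.findall(md):
        if run:
            matches += int(run)
        elif deletion:
            deleted += len(deletion) - 1  # do not count the leading '^'
        else:
            mismatches += 1
    columns = matches + mismatches + inserted + deleted
    return 100.0 * matches / columns if columns else 0.0


def identity_from_sam_line(line):
    """Return percent identity for one SAM record, or None if not available."""
    fields = line.rstrip("\n").split("\t")
    flag, cigar = int(fields[1]), fields[5]
    if flag & 0x4 or cigar == "*":  # 0x4 = read unmapped
        return None
    md = next((f[5:] for f in fields[11:] if f.startswith("MD:Z:")), None)
    if md is None:  # no MD tag present on this record
        return None
    return percent_identity(cigar, md)
```

To use it, loop over the non-header lines of the SAM file (those not starting with `@`), call `identity_from_sam_line` on each, and skip the `None` results. For example, a 10 bp read with CIGAR `10M` and `MD:Z:5A4` has 9 matches and 1 mismatch, which gives 90.0.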
Depending on why you need the answer, you could also simply extract some of the mapped reads, and explore the percent identity for those using a different alignment method.
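If you go the second route, a rough sketch for pulling a handful of mapped reads out of the SAM file as FASTA could look like this. The function name `mapped_reads_to_fasta` and the `limit` parameter are hypothetical. It assumes a plain-text SAM file (not BAM) and keeps only primary alignments. Note that SAM stores the sequence of reverse-strand alignments reverse-complemented, which should not matter for an aligner that searches both strands.

```python
def mapped_reads_to_fasta(sam_lines, limit=100):
    """Collect up to `limit` mapped primary reads from SAM lines as FASTA text."""
    records = []
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete SAM record
            continue
        flag = int(fields[1])
        if flag & 0x4:  # read unmapped
            continue
        if flag & 0x900:  # secondary (0x100) or supplementary (0x800)
            continue
        name, seq = fields[0], fields[9]
        if seq == "*":  # sequence not stored
            continue
        # Tag mates so both reads of a pair keep distinct names.
        mate = "/1" if flag & 0x40 else "/2" if flag & 0x80 else ""
        records.append(f">{name}{mate}\n{seq}")
        if len(records) >= limit:
            break
    return "\n".join(records) + ("\n" if records else "")
```

You would call it with an open file handle, e.g. `mapped_reads_to_fasta(open("aln.sam"))` where `aln.sam` stands in for your own file, and write the returned text to a `.fasta` file for the other aligner.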