12.1 years ago
User 8663
Hello, I have an alignment done in colourspace, in SAM format, and two of the statistics I need to get out of it are the length of each aligned read and its number of mismatches. The aligned length is easy, but I don't have the "NM" tag, only "CM". I use pysam to parse the files. The problem is that, along with the alignment, I also received a statistics file in which the maximum number of mismatches is 10, but if I use the "CM" tag I get 11. Also, if I compare the sequence I get from read.seq with the one in the reference assembly, I don't get "CM" mismatches.
Thanks in advance.
Can you post an example? The number of colourspace mismatches (CM) is not the same as the number of base-space mismatches (what you get by comparing the reference sequence with read.seq).
Of course, here is a quick one:
CM = 1
seq = AGAACAAAGTGCCTGTGTACCTAAGAAAGTTCCTGTCTCTCTAAGAGTCA
pos = mouse chr6:41071878, reverse strand, 50 bp
The sequence in Ensembl is identical!
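To illustrate why the two counts can disagree, here is a minimal sketch in plain Python (no pysam needed). It assumes the standard di-base encoding, where each colour is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3). The helper names `to_colors`, `from_colors` and `count_mismatches` and the 10 bp sequences are made up for the illustration; they are not taken from your data or from any aligner.

```python
# 2-bit base codes; the colour between two adjacent bases is the XOR of their codes.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def to_colors(seq):
    """Encode a base sequence as the list of colours between adjacent bases."""
    codes = [BASE_CODE[b] for b in seq.upper()]
    return [a ^ b for a, b in zip(codes, codes[1:])]


def from_colors(first_base, colors):
    """Naively decode colours back to bases, starting from a known first base."""
    codes = [BASE_CODE[first_base]]
    for c in colors:
        codes.append(codes[-1] ^ c)
    return "".join(CODE_BASE[c] for c in codes)


def count_mismatches(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


reference = "AGAACAAAGT"

# Case 1: a real single-base difference (A -> T at index 5).
read_snp = "AGAACTAAGT"
base_mm = count_mismatches(reference, read_snp)
color_mm = count_mismatches(to_colors(reference), to_colors(read_snp))
print(base_mm, color_mm)  # one base mismatch, two colour mismatches

# Case 2: a single wrong colour (index 4), decoded without any correction.
bad_colors = to_colors(reference)
bad_colors[4] ^= 1
decoded = from_colors(reference[0], bad_colors)
print(decoded, count_mismatches(reference, decoded))
```

In the first case one base difference shows up as two adjacent colour mismatches. In the second case a single colour error changes every downstream base if it is decoded naively. If the aligner instead treats an isolated colour mismatch as a sequencing error and corrects it when writing the base sequence (an assumption on my side, I don't know which aligner produced your file), you would see CM = 1 with a read.seq identical to the reference, which is consistent with your example.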