Hey there,
I am currently struggling with selecting the right scoring matrix, and some thoughts from you would be very helpful. As I am from the ancient DNA field, the samples I deal with consist mostly of short reads ranging from 40 to 150 bases. I am interested in finding reads of viral origin (whether that's a good idea, who knows?).
What I take from the paper "Selecting the Right Similarity-Scoring Matrix" is that I should use a "shallow" matrix like PAM30 for my analysis. The paper states that such a matrix is more informative because the match and mismatch scores differ a lot, which should let short alignments still reach significant scores. But is this the way to go?
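To make my reasoning concrete, here is a rough back-of-the-envelope sketch of why read length matters. It assumes a translated (protein-level) search, so a read of n bases covers at most n // 3 amino acids, and it assumes a target score of 50 bits for a significant hit. Both the function name and the 50-bit threshold are my own placeholders for illustration, not values taken from a specific tool.

```python
def min_bits_per_position(read_len_nt, target_bits=50.0):
    """Average bits per aligned residue a matrix must deliver so that a
    full-length translated alignment of the read reaches target_bits.

    target_bits is an assumed significance threshold (hypothetical default).
    """
    aa_len = read_len_nt // 3  # longest possible translated alignment
    if aa_len == 0:
        raise ValueError("read is too short to translate")
    return target_bits / aa_len


# Shortest and longest reads in my samples
for read_len in (40, 150):
    needed = min_bits_per_position(read_len)
    print(f"{read_len} nt -> {read_len // 3} aa, needs {needed:.2f} bits/position")
```

Under these assumptions the shortest reads need several times more information per aligned position than the longest ones, which is how I understand the argument for a shallow matrix.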