My aim is to find correlated mutations within a single read pair. For example, I need to know whether sequence ID X, which has a mutation at position 800, also has a mutation at position 1100. I have BAM and SAM files containing only the reads that span the regions I am interested in. I also have the FASTA sequences, and I used Translator X to translate them into protein FASTA.
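To make the goal concrete, here is a minimal sketch of the check I have in mind, in plain Python on toy data. It is not a BAM parser. The function names `base_calls_by_id` and `linked_mutations` are made up for illustration, and it assumes each read is given as (ID, 1-based start on the reference, sequence) with no insertions or deletions. Real alignments would need the CIGAR string to be taken into account.

```python
from collections import defaultdict


def base_calls_by_id(reads, positions):
    """Collect the base each read ID shows at the given 1-based reference positions.

    reads: iterable of (read_id, ref_start, sequence). ref_start is the 1-based
    reference position of the first base. Assumption: alignments contain no
    indels, so reference position and read offset move together.
    Both mates of a pair share one read_id, so their calls end up together.
    """
    calls = defaultdict(dict)
    for read_id, ref_start, seq in reads:
        for pos in positions:
            offset = pos - ref_start
            if 0 <= offset < len(seq):
                # If both mates cover the position, the later one overwrites.
                calls[read_id][pos] = seq[offset]
    return calls


def linked_mutations(reference, reads, pos_a, pos_b):
    """Return the sorted IDs that differ from the reference at both positions."""
    calls = base_calls_by_id(reads, (pos_a, pos_b))
    linked = []
    for read_id, seen in calls.items():
        if pos_a not in seen or pos_b not in seen:
            continue  # the pair does not cover both positions
        mutated_a = seen[pos_a] != reference[pos_a - 1]
        mutated_b = seen[pos_b] != reference[pos_b - 1]
        if mutated_a and mutated_b:
            linked.append(read_id)
    return sorted(linked)


# Toy data: a 10-base reference, with positions 3 and 8 standing in for 800 and 1100.
reference = "ACGTACGTAC"
reads = [
    ("X", 1, "ACTTA"), ("X", 6, "CGAAC"),  # mutated at 3 and at 8
    ("Y", 1, "ACTTA"), ("Y", 6, "CGTAC"),  # mutated at 3 only
    ("Z", 1, "ACTTA"),                     # does not cover position 8
]
print(linked_mutations(reference, reads, 3, 8))
```

Working at the nucleotide level like this would sidestep translation entirely, but I still need the protein-level view to see which amino acid changes are involved.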
I know what I expected to get back, but when I loaded the protein sequences into Clustal Omega to get an alignment, it did not work well. There are gaps, and some sequences were simply translated badly. The badly translated sequences are already present in the FASTA file I get from Translator X, whereas the corresponding sequences in the nucleotide FASTA look fine. Is there a way to feed my reference sequence into an alignment tool so that the protein sequences are translated and aligned correctly?
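One thing I could try, assuming the bad translations come from reads that start out of frame relative to the reference (this is a guess, I have not confirmed it), is to translate each read in the frame of the reference instead of from its own first base. The sketch below is plain Python. `translate_in_reference_frame` and its `cds_start` parameter are hypothetical names. It assumes the standard genetic code, a read with no indels, and a read start at or after the start of the coding sequence.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_in_reference_frame(seq, ref_start, cds_start=1):
    """Translate a read in the reading frame of the reference coding sequence.

    seq: read sequence as aligned to the reference (assumption: no indels).
    ref_start: 1-based reference position of the read's first base.
    cds_start: 1-based reference position of the first base of the coding
        sequence (assumption: ref_start >= cds_start).
    Returns (first_codon, protein), where first_codon is the 0-based index of
    the first complete codon the read covers.
    """
    offset = (ref_start - cds_start) % 3   # position of first base within its codon
    skip = (3 - offset) % 3                # bases to drop to reach a codon boundary
    in_frame = seq[skip:].upper()
    usable = len(in_frame) - len(in_frame) % 3  # ignore a trailing partial codon
    protein = "".join(
        CODON_TABLE.get(in_frame[i:i + 3], "X")  # X for codons with N or other symbols
        for i in range(0, usable, 3)
    )
    first_codon = (ref_start - cds_start + skip) // 3
    return first_codon, protein


# Toy coding sequence ATG GCC AAA TGA; the second read starts one base in.
print(translate_in_reference_frame("ATGGCCAAATGA", 1))
print(translate_in_reference_frame("TGGCCAAATGA", 2))
```

Because each fragment comes back with the index of its first codon, the fragments could be placed against the translated reference directly, without a separate protein alignment step.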
Does anybody have any experience with this type of analysis?