Question: How To "Assemble" Proteins Using Amino Acids, How Do I Align Fragments Of A Protein To Get The Best Overall Protein?
7.5 years ago, Gimly_Gloin70 wrote:

I've just attempted a frameshift correction of my DNA sequences using fasty36. The output is a set of protein sequences that are "hopefully" in the correct frame. I have a huge number of duplicates. I've tried to reduce them by clustering, which has worked somewhat, but I still have a large number of sequences where the hits to my reference sequences give me different translations for the same read. (The protein database I used as my reference seems to have its own ambiguities/frameshifts that change the protein sequence in places. I could scrutinize this, but I don't want to, because these may be biologically significant.)

I want to align the candidate translations of each read (now that I have many possibilities for the same read) against each other AND choose the best fit to the reference sequences (well over 2000 sequences for the MSA). So I was wondering if there is a way to align them end to end, similar to a DNA assembly?

In fasty, if I set the e-value cutoff to be very small, I lose a lot of sequences.

So far I've done:

1) fasty36 run of my DNA-reads against a Protein reference sequence

2) extracted the "corrected" sequences with "/", "\", "*", "-"...

3) removed those chars for the next step

4) ran fasta36 with my translated DNA against my protein reference sequence, using a high e-value to collect the "perfect" alignments (<--- probably an unneeded step)
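
For illustration, step 3 above could be sketched roughly as follows. This is only a minimal sketch, not the exact script used: it assumes the "corrected" sequences are already in FASTA-like text, that all four characters listed in step 2 are annotation rather than residues, and the names `strip_markers` and `clean_fasta` as well as the example read are hypothetical.

```python
# Characters listed in step 2. Assumption: all four are alignment
# annotation, not residues, so they are simply dropped.
MARKERS = "/\\*-"

# Translation table that deletes every marker character.
_STRIP_TABLE = str.maketrans("", "", MARKERS)


def strip_markers(aligned_seq: str) -> str:
    """Return the sequence with '/', '\\', '*' and '-' removed."""
    return aligned_seq.translate(_STRIP_TABLE)


def clean_fasta(lines):
    """Yield FASTA lines with markers removed from sequence lines only."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield line  # header lines are left untouched
        elif line:
            yield strip_markers(line)


if __name__ == "__main__":
    # Hypothetical read, made up for this example.
    example = [">read1 (hypothetical)", "MKT/LLV\\AG-S*"]
    print("\n".join(clean_fasta(example)))
```

Note that dropping "-" closes up alignment gaps, so the cleaned sequences are no longer in the coordinates of the original alignment.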

Thoughts, advice, and alternative solutions/suggestions are much appreciated.

modified 5.9 years ago by Biostar • written 7.5 years ago by Gimly_Gloin70