Question: Alignment as BLAST-able database/other alignment avenues
jnf3769 (United States) wrote, 3.2 years ago:

Hey all,

I was wondering if there is a way to use an alignment as a local BLAST database. My problem is as follows: I have an old alignment of concatenated protein sequences, with CHARSET definitions at the bottom of the NEXUS file marking where each protein begins and ends. Some folks doing method development created a method that produces a different sort of alignment, and they also removed some of the sequences, which changed the overall length of the alignment. Thus, the CHARSET definitions no longer delimit the genes, but I need them to for a downstream step. I have all the protein sequences, both as they were in the old alignment and as plain unaligned sequences. My natural thought was to make a BLAST database out of the alignment and query the individual sequences against it to recover the boundaries. But I don't seem to be able to make such a local database the old fashioned way. Is there a way with BLAST?

Barring that, I think MAFFT has experimental 'addsequences' and 'addfragments' functionality (the `--add` and `--addfragments` options). But they are terribly slow and I don't know if they are appropriate for my goal. Anybody have any insight? Is MAFFT a reasonable tool for this? Is there a BLAST method? Or does a more traditional computer-science approach, such as string-based distance minimization, come to mind? I really appreciate any help or insight you all can offer. Ideally, I'd do the alignment traditionally, but, like I said, this is downstream of some folks working on algorithmic method development, and they haven't implemented a way to keep track of this stuff quite yet.

Best!

Tags: mafft, blast, nexus, alignment
written 3.2 years ago by jnf3769

But I don't seem to be able to make such a local database the old fashioned way.

makeblastdb is the current way of making local BLAST databases (run the command with the -help flag to get inline help). If you are familiar with legacy BLAST, you should be able to pick up the differences and get going with BLAST+ easily.

written 3.2 years ago by genomax

Well, that is what I consider the old fashioned way. But in any case, you've completely missed the spirit of my question. You can't make a BLAST database from an alignment that way; the help output doesn't describe such a method, at least.

https://ibb.co/cSLNKF https://ibb.co/kmcf6v

written 3.2 years ago by jnf3769

If you want to make a BLAST database out of an alignment, you will need to take the gaps out of that FASTA file. You could easily use sed to replace each - with nothing (on the sequence lines only, so that hyphens in the header lines are left alone). If you wish to preserve the alignment, then you will have to use the add sequence/add fragment method with a multiple sequence alignment program, as you noted above.
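As a rough illustration of the gap-stripping step, here is a minimal Python sketch. It assumes the alignment is available as aligned FASTA text with `-` as the only gap character; the function name `strip_gaps` and the toy record are hypothetical and not part of BLAST or any library.

```python
def strip_gaps(fasta_text, gap_chars="-"):
    """Remove gap characters from sequence lines, leaving header lines untouched."""
    table = str.maketrans("", "", gap_chars)
    kept = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Header line: keep as-is, even if it contains hyphens.
            kept.append(line)
        else:
            ungapped = line.translate(table)
            # Skip lines that consisted only of gaps.
            if ungapped:
                kept.append(ungapped)
    return "\n".join(kept) + "\n"


# Toy aligned record (hypothetical data).
aligned = ">tax-1\nMK-LV\n--\nA-GT\n"
print(strip_gaps(aligned), end="")
```

The ungapped output could then be written to a file and passed to `makeblastdb` with `-in` and `-dbtype prot`. Note that this discards the column information, so on its own it does not recover the CHARSET boundaries.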

modified 3.2 years ago • written 3.2 years ago by genomax

While I thank you for your response, you are again missing the spirit of the question. The gaps are necessary. It would not be an alignment if I removed the gaps, nor would it tell me where the protein sequences are delineated in the alignment.
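One string-based way to keep the gaps and still recover the boundaries, sketched here under assumptions: if a taxon appears in both the old and the new alignment with the same ungapped sequence, each old CHARSET range can be translated into residue indices for that taxon and then into columns of the new alignment. The function names, the toy sequences, and the 1-based inclusive `(start, end)` representation of a CHARSET are all hypothetical choices for illustration.

```python
def residue_columns(aligned, gap="-"):
    """Return the 1-based alignment column of every non-gap residue, in order."""
    return [col for col, ch in enumerate(aligned, start=1) if ch != gap]


def remap_charsets(old_aligned, new_aligned, charsets, gap="-"):
    """Translate 1-based inclusive column ranges from the old alignment to the new one.

    Uses a single reference taxon that is present in both alignments.
    Returns None for a gene in which this taxon has no residues.
    """
    if old_aligned.replace(gap, "") != new_aligned.replace(gap, ""):
        raise ValueError("ungapped sequences differ between the two alignments")
    old_cols = residue_columns(old_aligned, gap)
    new_cols = residue_columns(new_aligned, gap)
    remapped = {}
    for name, (start, end) in charsets.items():
        # Residue indices of this taxon that fall inside the old gene block.
        inside = [k for k, col in enumerate(old_cols) if start <= col <= end]
        if inside:
            remapped[name] = (new_cols[inside[0]], new_cols[inside[-1]])
        else:
            remapped[name] = None
    return remapped


# Toy example (hypothetical data): two concatenated genes for one taxon.
old = "MK-LVA-GT"
new = "M-KLV-AGT-"
old_charsets = {"geneA": (1, 5), "geneB": (6, 9)}
print(remap_charsets(old, new, old_charsets))
# {'geneA': (1, 5), 'geneB': (7, 9)}
```

Because a taxon with gaps at a gene boundary gives a range that is slightly too narrow, it is probably worth running this for several taxa and taking the smallest start and largest end per gene. If the new method interleaves columns from different genes, no single set of contiguous ranges will be exact.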

written 3.2 years ago by jnf3769

See the modified comment above. Someone else may be along with a new comment/answer.

modified 3.2 years ago • written 3.2 years ago by genomax