Question

Question about sequence alignment

0

15 months ago

Bad10 • 0

I have a protein FASTA sequence file from a specific reference strain. I would like to compare this sequence to the sequences of ~150 other strains of the same species. Does anyone have any suggestions for how to achieve this? I already tried BLAST but it does not seem to pull from many of the strains I am interested in.

I have already downloaded the ~150 proteome files for these strains, so now I just need to figure out how to align a single protein sequence against 150 full proteome files. If there are any easy-to-use tools like BLAST or MUSCLE that let you essentially create your own database to align to, that would be ideal. That being said, I don't mind learning a bit of R if that would be the best way to approach this problem. Apologies in advance, I'm really a novice when it comes to this, but I would immensely appreciate any help.

sequence alignment • 648 views

0

You can do taxonomic restriction with BLAST if you know the TaxIDs of all the strains you're interested in, but it may not be the most efficient solution.

Reply • 15 months ago by Joe 21k

score 0 · Answer 1 · 2023-07-03

0

15 months ago

shenwei356 8.6k

"I already tried BLAST but it does not seem to pull from many of the strains I am interested in."

blastp should be the most sensitive approach. If it finds no hit in a given strain, that result may well be genuine: the strain may simply lack a detectable homolog of your protein.

"I just need to figure out how to align a single protein sequence to 150 full proteome files."

Concatenate all the FASTA files into a single file and build a protein BLAST database from it with makeblastdb; blastp can then search your query against that database.
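For anyone who wants to script this, here is a minimal Python sketch of the concatenate-then-search idea. It assumes NCBI BLAST+ (`makeblastdb`, `blastp`) is installed and on your PATH, and that each proteome file is named after its strain. The helper names, the `__` separator, the `.faa` extension, the e-value cutoff and the output file names are all my own illustrative choices, not anything prescribed by BLAST. Each FASTA header gets prefixed with the file name so that a hit can be traced back to the strain it came from.

```python
import subprocess
from pathlib import Path


def concatenate_fasta(fasta_paths, combined_path):
    """Merge FASTA files into one, tagging each header with its source file.

    Assumption: the file stem (e.g. "strainA" for strainA.faa) names the strain.
    Returns the number of sequence records written.
    """
    n_records = 0
    with open(combined_path, "w") as out:
        for path in map(Path, fasta_paths):
            strain = path.stem
            with open(path) as handle:
                for line in handle:
                    line = line.rstrip("\r\n")
                    if not line:
                        continue  # skip blank lines between records
                    if line.startswith(">"):
                        # ">id description" becomes ">strain__id description"
                        line = f">{strain}__{line[1:]}"
                        n_records += 1
                    out.write(line + "\n")
    return n_records


def makeblastdb_command(combined_path, db_name):
    """Argument list that builds a protein BLAST database."""
    return ["makeblastdb", "-in", str(combined_path),
            "-dbtype", "prot", "-out", str(db_name)]


def blastp_command(query_path, db_name, hits_path, evalue="1e-5"):
    """Argument list that searches the database and writes tabular hits."""
    return ["blastp", "-query", str(query_path), "-db", str(db_name),
            "-outfmt", "6", "-evalue", evalue, "-out", str(hits_path)]


def search_proteomes(query_path, proteome_dir, workdir="."):
    """Concatenate every *.faa file in proteome_dir, build a database, run blastp."""
    workdir = Path(workdir)
    combined = workdir / "all_strains.faa"   # hypothetical file names
    db_name = workdir / "all_strains_db"
    hits = workdir / "hits.tsv"
    proteomes = sorted(Path(proteome_dir).glob("*.faa"))
    concatenate_fasta(proteomes, combined)
    subprocess.run(makeblastdb_command(combined, db_name), check=True)
    subprocess.run(blastp_command(query_path, db_name, hits), check=True)
    return hits
```

Calling something like `search_proteomes("reference_protein.faa", "proteomes")` (both names are placeholders) would leave a tab-separated `hits.tsv`, where the second column is the subject ID. In recent BLAST+ releases that should be the prefixed header ID, so the part before `__` tells you the strain. Adjust the glob pattern if your proteome files use a different extension.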

0

Thank you, I was not aware you could create your own blastp database from FASTA sequences. That is exactly what I was looking for.

Reply • 15 months ago by Bad10 • 0