blastall vs bl2seq, and blastall fasta output
5.9 years ago
thomas.welch ▴ 50

Hi there, I hope one of you guys can help me with this quite basic question.

First of all, I am trying to pick out homologs for phylogenetic analysis using standalone BLAST in the Unix terminal. I am using a single gene in FASTA format as the query against a downloaded genome formatted with formatdb. I get the output I want with this, but is there a way to get my highest-scoring hit written out as a FASTA file?

Secondly, when I run the same search using the bl2seq command I get very different output, with much shorter hits (clearly noise), and when I apply the same E-value cutoff that I use with my blastall search, I get no hits at all.

blast blastall bl2seq blast+ fasta • 1.7k views
You can use the blastdbcmd utility to retrieve FASTA-formatted sequences for the hits you are interested in (see the thread "NCBI Blast locally: filter by accession number and NOT by GI number"). You may have to switch to a tabular output format and parse out the accession numbers you need.
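A minimal sketch of that parsing step, assuming the search is rerun with tabular output (`-m 8` in blastall, `-outfmt 6` in blast+). The 12 tab-separated columns are then query ID, subject ID, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value and bit score. The function name `best_hit` and the example IDs are made up for illustration.

```python
def best_hit(tabular_lines):
    """Return (subject_id, sstart, send, bitscore) for the hit with the
    highest bit score, or None if there are no hits."""
    best = None
    for line in tabular_lines:
        line = line.strip()
        # Skip blank lines and the comment lines some tabular formats emit.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        subject_id = fields[1]
        sstart, send = int(fields[8]), int(fields[9])
        bitscore = float(fields[11])
        if best is None or bitscore > best[3]:
            best = (subject_id, sstart, send, bitscore)
    return best


# Hypothetical hits for illustration only.
example = [
    "gene1\tcontig_12\t98.50\t200\t3\t0\t1\t200\t501\t700\t1e-100\t350.0",
    "gene1\tcontig_7\t80.00\t50\t10\t0\t10\t59\t940\t891\t1e-05\t45.0",
]
print(best_hit(example))
```

The subject ID returned here is what you would hand to blastdbcmd via its `-entry` option. Note that blastdbcmd is part of blast+, so this assumes the database can be read by it.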

Blast results depend heavily on the size of the database being searched against. A regular blast search and blasting two sequences against each other are significantly different.
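For reference, the dependence on database size comes from how E-values are computed. In the Karlin-Altschul statistics that BLAST uses, the expected number of chance alignments with score at least $S$ is

```latex
E = K \, m \, n \, e^{-\lambda S}
```

where $m$ is the (effective) query length, $n$ is the (effective) total length of the database, and $K$ and $\lambda$ are constants that depend on the scoring system. The same alignment score therefore gets a different E-value when $n$ changes, so a cutoff chosen for a genome-sized database does not carry over directly to a two-sequence comparison.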

Thank you for your help. Unfortunately, this command does not give me the hit. The genome I ran formatdb on is a FASTA file of shotgun sequence runs for a whole genome. While blastdbcmd gives me a FASTA file of the accession that contains the hit, it does not give me the hit itself.

Since you are now providing this additional information, the solution changes. You can convert the BLAST hit coordinates into BED format (chr, start, stop) and then use bedtools getfasta to retrieve the sequences you need.

Thank you, but I don't think this will work either. The file for the organism I am investigating does not have chromosome coordinates (although there are of course query start and stop coordinates). It looks like I will simply have to write a script to extract the correct (query) lines from the blastall output file and then stick them together.

chr = whatever_name_you_have_for_subject in this case
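
As a rough sketch of that conversion, assuming tabular BLAST output (`-m 8` in blastall, `-outfmt 6` in blast+), where column 2 is the subject ID and columns 9 and 10 are the subject start and end. BLAST coordinates are 1-based and inclusive, while BED is 0-based and half-open, and a hit on the minus strand is reported with start greater than end. The function name `blast_tab_to_bed` and the example hit are made up for illustration.

```python
def blast_tab_to_bed(line):
    """Convert one tabular BLAST hit into a 6-column BED line on the subject."""
    fields = line.rstrip("\n").split("\t")
    query_id, subject_id = fields[0], fields[1]
    sstart, send = int(fields[8]), int(fields[9])
    bitscore = fields[11].strip()
    # Minus-strand hits have sstart > send in BLAST output.
    strand = "+" if sstart <= send else "-"
    # 1-based inclusive -> 0-based half-open.
    start = min(sstart, send) - 1
    end = max(sstart, send)
    return "\t".join([subject_id, str(start), str(end), query_id, bitscore, strand])


# Hypothetical minus-strand hit for illustration only.
hit = "gene1\tcontig_7\t80.00\t50\t10\t0\t10\t59\t940\t891\t1e-05\t45.0"
print(blast_tab_to_bed(hit))
```

With the lines written to a file (say `hits.bed`, a placeholder name), something like `bedtools getfasta -fi genome.fasta -bed hits.bed -s` should return the hit regions, with `-s` reverse-complementing minus-strand hits. The first BED column has to match the FASTA header names in the genome file exactly.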

Got it. This works perfectly. Thank you very much.

Any particular reason why you are running blastall rather than blast+? blastall is the legacy version of BLAST.