blastall vs bl2seq, and blastall fasta output
5.1 years ago
thomas.welch ▴ 50

Hi there, I hope one of you guys can help me with this quite basic question.

First of all, I am trying to pick out homologs for phylogenetic analysis using standalone BLAST in the Unix terminal. I am using a single gene in FASTA format as the query against a downloaded genome formatted with formatdb. I get the output I want with this, but is there a way to get my highest-scoring hit as a FASTA file?

Secondly, when I run the same search using the bl2seq command I get very different output, with much shorter hits (clearly noise), and when I apply the same E-value constraints as in my blastall search, I get no hits at all.

blast blastall bl2seq blast+ fasta • 1.5k views

You can use the blastdbcmd utility to retrieve FASTA-formatted sequences for the hits you are interested in (see the thread "NCBI Blast locally: filter by accession number and NOT by GI number"). You may have to switch to a tabular output format and parse out the accession numbers you need.
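
As a rough sketch of the "parse the tabular output" step: the snippet below assumes 12-column tabular output (`-m 8` in legacy blastall, `-outfmt 6` in BLAST+), picks the row with the highest bit score, and builds a blastdbcmd command line for that subject. The function names, the example rows and the database name `my_genome` are made up for illustration. `-db`, `-entry`, `-range` and `-strand` are blastdbcmd options, but check `blastdbcmd -help` on your installation, and note that lookup by ID generally only works if the database was built with sequence IDs indexed.

```python
import csv
import io

# Column order of the default 12-column tabular BLAST output.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hit(tabular_text):
    """Return the row with the highest bit score as a dict, or None if empty."""
    best = None
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for fields in reader:
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(COLUMNS, fields))
        if best is None or float(row["bitscore"]) > float(best["bitscore"]):
            best = row
    return best


def blastdbcmd_args(hit, db="my_genome"):
    """Build a blastdbcmd argument list for the subject region of one hit."""
    sstart, send = int(hit["sstart"]), int(hit["send"])
    # Hits on the minus strand are reported with sstart > send.
    strand = "plus" if sstart <= send else "minus"
    lo, hi = min(sstart, send), max(sstart, send)
    return ["blastdbcmd", "-db", db, "-entry", hit["sseqid"],
            "-range", f"{lo}-{hi}", "-strand", strand]


# Two invented hits for one query; the first has the higher bit score.
example = (
    "geneA\tcontig_12\t91.50\t400\t30\t4\t1\t400\t5200\t4801\t1e-150\t520\n"
    "geneA\tcontig_07\t78.00\t120\t20\t6\t50\t169\t300\t419\t2e-10\t65.8\n"
)
hit = best_hit(example)
print(" ".join(blastdbcmd_args(hit)))
```

The printed command can be run in the terminal with its output redirected to a file to get the hit region in FASTA format.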

BLAST results depend heavily on the size of the database being searched, because E-values are calculated relative to the size of the search space. A regular BLAST search against a database and blasting two sequences against each other are therefore significantly different.
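
To make the database-size point concrete: in the Karlin-Altschul statistics that BLAST uses, the expected number of chance hits with bit score S' is roughly E = m * n * 2^(-S'), where m is the query length and n is the total length of what is searched. The sketch below ignores the effective-length corrections BLAST applies, and all lengths are invented, so treat it as an illustration of the scaling only.

```python
def expect_value(bit_score, query_len, search_len):
    """Approximate E-value: E = m * n * 2**(-S'), with no edge-effect correction."""
    return query_len * search_len * 2.0 ** (-bit_score)


query_len = 1_500            # hypothetical gene length (bp)
genome_len = 500_000_000     # hypothetical total length of the formatted genome
single_len = 20_000          # hypothetical single subject sequence

# Same bit score, two different search spaces.
for label, n in [("genome database", genome_len), ("single subject", single_len)]:
    print(f"{label}: E = {expect_value(40.0, query_len, n):.2e}")
```

With these made-up lengths the same alignment score gets an E-value 25,000 times larger in the genome search than in the pairwise comparison, so an identical E-value cutoff is not the same filter in the two searches.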

Thank you for your help. Unfortunately, this command does not give me the hit. The genome I ran formatdb on is a FASTA file of shotgun sequence runs for a whole genome. While blastdbcmd gives me a FASTA file of the accession that contains the hit, it does not give me the hit itself.

Since you are now providing this additional information, the solution changes. You can convert the BLAST hit coordinates into BED format (chr, start, stop) and then use bedtools getfasta to retrieve the sequences you need.
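
A minimal sketch of that conversion, assuming 12-column tabular BLAST output (`-m 8` in blastall, `-outfmt 6` in BLAST+) and using the subject ID as the BED "chr" field. The function name and the example line are invented for illustration. BLAST coordinates are 1-based and inclusive, while BED starts are 0-based and ends are exclusive, and minus-strand hits are reported with subject start greater than subject end, so both need handling.

```python
def blast_row_to_bed(line):
    """Convert one 12-column tabular BLAST line to a 6-column BED line."""
    f = line.rstrip("\n").split("\t")
    qseqid, sseqid = f[0], f[1]
    sstart, send = int(f[8]), int(f[9])
    # Minus-strand hits have sstart > send in BLAST output.
    strand = "+" if sstart <= send else "-"
    start = min(sstart, send) - 1   # BED start is 0-based
    end = max(sstart, send)         # BED end is exclusive, so the 1-based end is unchanged
    # Columns: chrom, start, end, name, score (placeholder 0), strand.
    return "\t".join([sseqid, str(start), str(end), qseqid, "0", strand])


row = "geneA\tcontig_12\t91.50\t400\t30\t4\t1\t400\t5200\t4801\t1e-150\t520"
print(blast_row_to_bed(row))
```

Writing one such line per hit to a file (say `hits.bed`, a placeholder name) lets you run something like `bedtools getfasta -fi genome.fasta -bed hits.bed -s -fo hits.fasta`, where `-s` reverse-complements minus-strand hits. This assumes `genome.fasta` is the same FASTA file that was formatted as the database and that its header names match the subject IDs in the BLAST output.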

Thank you, but I don't think this will work either. The file for the organism I am investigating does not have chromosome coordinates (although there are of course query start and stop coordinates). It looks like I will simply have to write a script to extract the correct (query) lines from the blastall output file and then stick them together.

`chr = whatever_name_you_have_for_subject` in this case

Got it. This works perfectly. Thank you very much.

Any particular reason why you are running blastall rather than BLAST+? blastall is the legacy version of BLAST.
