Question: BLAST search with protein query in assembly file with .fna extension.
4.8 years ago, biophoenix wrote:

I am a complete newbie in bioinformatics and not comfortable with command-line applications. I recently ran into a problem running a tblastn search of a protein query against genomes. The online BLAST server does not have all the genomes I need, so I decided to run the search on my own computer. I downloaded genome archives for different organisms, unzipped them, and renamed the files for convenience. The files originally had *.fna or *.fasta extensions; I saved them all as *.fna, as in the NCBI databases, and sorted them into folders by taxonomic position. Which freely available program with a GUI can be used for a tblastn search for protein homologs in the downloaded genomes, and are there any special tips I should pay attention to? One problem I encounter when trying to use UGENE is that it asks me to make a database and then starts to pour out errors.

blast genome • 1.6k views

I don't know of any free programs that would allow GUI-based tblastn searches of locally stored genome files. For any BLAST search against a custom genome, you will have to make the BLAST databases first. You could use BLAT to do the searches, but again that would have to be done on the command line.
Any chance you are interested in getting your hands dirty and learning some command-line searching?
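
For a sense of what the command-line route involves, here is a minimal Python sketch of the two steps for a single genome: build a nucleotide database with makeblastdb, then search it with tblastn. It assumes BLAST+ is installed and on the PATH. The file names, the work folder, the tabular output format (-outfmt 6), and the E-value cutoff of 1e-5 are illustrative choices, not anything from this thread.

```python
import subprocess
from pathlib import Path


def makeblastdb_command(genome_fasta, db_prefix):
    """Build the argument list that formats one genome as a nucleotide BLAST database."""
    return ["makeblastdb", "-in", str(genome_fasta),
            "-dbtype", "nucl", "-out", str(db_prefix)]


def tblastn_command(protein_query, db_prefix, hits_file, evalue="1e-5"):
    """Build the argument list that searches a protein query against that database."""
    # -outfmt 6 is tab-separated output; the E-value cutoff is an arbitrary example.
    return ["tblastn", "-query", str(protein_query), "-db", str(db_prefix),
            "-out", str(hits_file), "-outfmt", "6", "-evalue", evalue]


def search_genome(protein_query, genome_fasta, work_dir):
    """Run both steps for one genome and return the path of the hits table."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    name = Path(genome_fasta).stem
    db_prefix = work_dir / name
    hits_file = work_dir / (name + ".tblastn.tsv")
    # Arguments are passed as a list, so no shell quoting is involved.
    subprocess.run(makeblastdb_command(genome_fasta, db_prefix), check=True)
    subprocess.run(tblastn_command(protein_query, db_prefix, hits_file), check=True)
    return hits_file
```

Calling search_genome("query.faa", "genome.fna", "blast_work") (all three names hypothetical) would leave the database files and a hits table in the blast_work folder.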

written 4.8 years ago by GenoMax

Well, I am now trying standalone BLAST+ on 64-bit Windows 7. How can I easily generate BLAST databases for each assembly file (names follow the pattern TIDnumbers_NNN_numbers.number_GENUS_SPECIES_strain.fna) with one command? I have nearly 700 genomes, each 10+ MB in size. Also, how can I combine the databases that share a genus name into one aggregate database? I also noticed that UGENE, which has integrated BLAST+ makeblastdb and search options, generates errors when searching, saying that there is a space character in the name of the folder for temporary files. I checked and this is true, but the space is in the name of the user's main documents folder, because the username contains a space. Renaming the account via the Control Panel did not change the folder name, and the folder cannot be renamed manually with F2.

written 4.8 years ago by biophoenix

Is there any way you can get access to a Unix computer, a VM running Unix on your Windows machine, or a local Unix person who can help? It would be so much easier to work in Unix.
The space in the folder name could be worked around by escaping the space character. You may be able to use PowerShell to write batch scripts to do the indexing, but considering the size of the dataset you have an uphill battle ahead of you.
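
As an alternative to PowerShell, the indexing could be scripted in Python, which runs the same way on Windows and Unix. The sketch below is only one possible approach. It assumes BLAST+ is on the PATH, that every file name follows the TIDnumbers_NNN_numbers.number_GENUS_SPECIES_strain.fna pattern so that the genus is the fourth underscore-separated field, and that the working folders have no spaces in their paths. Passing arguments as a list avoids shell quoting, but -dblist is itself a space-separated list, so spaces inside database paths would still break it. The folder names and the helper names (genus_of, build_all, and so on) are made up for illustration.

```python
import subprocess
from collections import defaultdict
from pathlib import Path


def genus_of(fasta_path):
    """Return the genus field of a name like TID1_GCF_000005845.2_Escherichia_coli_K12.fna."""
    fields = Path(fasta_path).stem.split("_")
    if len(fields) < 5:
        raise ValueError(f"unexpected file name: {fasta_path}")
    return fields[3]  # assumed position of GENUS in the naming pattern


def makeblastdb_command(fasta_path, db_dir):
    """Argument list that builds one nucleotide database named after the FASTA file."""
    db_prefix = Path(db_dir) / Path(fasta_path).stem
    return ["makeblastdb", "-in", str(fasta_path),
            "-dbtype", "nucl", "-out", str(db_prefix)]


def group_by_genus(fasta_paths):
    """Map each genus to the list of FASTA files that belong to it."""
    groups = defaultdict(list)
    for path in sorted(str(p) for p in fasta_paths):
        groups[genus_of(path)].append(Path(path))
    return dict(groups)


def alias_command(genus, fasta_paths, db_dir):
    """Argument list that creates one alias database covering a whole genus."""
    db_list = " ".join(str(Path(db_dir) / Path(p).stem) for p in fasta_paths)
    return ["blastdb_aliastool", "-dblist", db_list, "-dbtype", "nucl",
            "-out", str(Path(db_dir) / genus), "-title", genus]


def build_all(genome_dir, db_dir):
    """Index every .fna file under genome_dir, then add one alias per genus."""
    Path(db_dir).mkdir(parents=True, exist_ok=True)
    fastas = list(Path(genome_dir).rglob("*.fna"))
    for fasta in fastas:
        subprocess.run(makeblastdb_command(fasta, db_dir), check=True)
    for genus, members in group_by_genus(fastas).items():
        subprocess.run(alias_command(genus, members, db_dir), check=True)
```

Running build_all("C:/blast/genomes", "C:/blast/db") (hypothetical folders) would produce one database per assembly plus one alias per genus, and the alias name can then be given to tblastn with -db to search all assemblies of that genus at once.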

written 4.8 years ago by GenoMax