How to Return Non-Aligned Proteins - BLAST
1
0
21 months ago

How can I get BLAST to output all the sequences that did not align? For example, I have a protein and I BLAST it against a dataset (of about 200 proteins), and BLAST returns 3 proteins in the output file. How can I make it return the other 197 proteins?

1

If you don't know how to use the command line then this analysis is going to be an uphill task. It can be done but will need you to put effort into it. Here is a pretty basic intro from NCBI on how to use blast+ programs on the command line. What OS are you using on your laptop?

Broad outline for the analysis would look like this:
1. You will create a blast database (makeblastdb) from the data you downloaded.
2. Next, you will run your query (or queries) against this db and save the results in blast+ outfmt 7. This format also lists queries that have no hits in the database (they are marked with a "# 0 hits found" comment line). Sounds like you are interested in those.
3. You will then parse the identifiers of the queries that have no hits out of the results.
4. You can then pick out those queries from your initial set.
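
Step 3 of the outline above could be sketched roughly as follows in Python. This is only a minimal sketch, not the poster's own script: it assumes the results were produced with something like `makeblastdb -in db.fasta -dbtype prot -out mydb` followed by `blastp -query queries.fasta -db mydb -outfmt 7 -out results.txt` (all file and database names here are hypothetical), and that each query block in the outfmt 7 output carries a `# Query:` comment line plus a `# N hits found` line. The function name `queries_without_hits` is made up for illustration.

```python
def queries_without_hits(outfmt7_text):
    """Return the IDs of queries whose outfmt 7 block says '# 0 hits found'."""
    no_hits = []
    current_query = None
    for line in outfmt7_text.splitlines():
        if line.startswith("# Query:"):
            # The query ID is taken to be the first token after '# Query:'
            rest = line[len("# Query:"):].strip()
            current_query = rest.split()[0] if rest else None
        elif line.startswith("# 0 hits found") and current_query is not None:
            no_hits.append(current_query)
    return no_hits


# Tiny made-up example of outfmt 7 output (two queries, one without hits)
sample = (
    "# BLASTP 2.12.0+\n"
    "# Query: prot1 some description\n"
    "# Database: mydb\n"
    "# 1 hits found\n"
    "prot1\tsubjA\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t400\n"
    "# BLASTP 2.12.0+\n"
    "# Query: prot2\n"
    "# Database: mydb\n"
    "# 0 hits found\n"
    "# BLAST processed 2 queries\n"
)

print(queries_without_hits(sample))  # prints ['prot2']
```

On a real run you would read the text from the results file instead, for example `queries_without_hits(open("results.txt").read())`, and then use the returned IDs to pick the sequences out of the original query FASTA (step 4).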

0

OK, so I tried this and that's not what I want. For example, let's suppose I have a FASTA file with only one protein. I want to search for this protein in a dataset with an e-value cutoff of, say, 10e-6. I want BLAST to return every protein that did not match.

0

What you are asking is non-trivial, and you are not helping yourself by admitting how little you have done and how little you know about this subject. Imagine that you are a great painter, and I come to you armed with a nice canvas and quality colors. I have zero knowledge about painting and I ask you to teach me and spare no details. That's how your request feels to me - not that I am a great painter or great anything :-)

Maybe you could start by investing some effort on your own. BLAST is one of the most widely used programs out there and it is very easy to find information about it - I will point you here to get you going.

0

I've been going mad trying to understand how to do BLAST because I have no experience with it, or Linux, or anything else. I have just read some guides that helped me with a few things, but I am getting nowhere.

0
21 months ago

Taking into account your comments above, you are looking for the proteins from your database that are not a potential hit for any of your three input proteins. I can think of a few ways to do so, but given that you are clearly (as mentioned in your comments above) not proficient in command-line bash processing, they are likely not useful options for you.

The only clear option I see for your problem is to reverse your BLAST setup: use what you have been using as input (the 3 proteins here) as the DB, and use what is now your DB as the input for the BLAST. Then, following @genomax's guidelines, run the BLAST with outfmt 7, which will report which sequences do not match.
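
For readers who are comfortable with a little scripting, another possible route (not the reversed setup described above) is to keep the original query/database orientation and subtract the subject IDs that appear in the results from the full list of database IDs. The sketch below is only an illustration under assumptions: the results are tab-separated (outfmt 6 or 7) with the subject ID in the second column, and those IDs match the first word of each FASTA header in the database file, which may not hold for every makeblastdb setup. The function names are hypothetical.

```python
def fasta_ids(fasta_text):
    """Return the first word of every '>' header line, in file order."""
    return [
        line[1:].split()[0]
        for line in fasta_text.splitlines()
        if line.startswith(">") and line[1:].strip()
    ]


def hit_subject_ids(tabular_text):
    """Collect subject IDs (column 2) from tabular BLAST output, skipping '#' comments."""
    hits = set()
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) > 1:
            hits.add(columns[1])
    return hits


def unmatched_subjects(fasta_text, tabular_text):
    """Database IDs that never appear as a subject in the BLAST results."""
    hits = hit_subject_ids(tabular_text)
    return [seq_id for seq_id in fasta_ids(fasta_text) if seq_id not in hits]


# Made-up example: a 4-protein database and one reported hit
db_fasta = ">A first protein\nMKV\n>B\nMLL\n>C\nMAA\n>D\nMTT\n"
results = "# 1 hits found\nquery1\tB\t99.0\t3\t0\t0\t1\t3\t1\t3\t1e-10\t50\n"

print(unmatched_subjects(db_fasta, results))  # prints ['A', 'C', 'D']
```

Only hits that pass the e-value cutoff given to BLAST (for example via `-evalue`) end up in the results, so the cutoff decides which database proteins count as "matched" here.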