I am primarily a molecular biologist, but I have been running into bioinformatics problems more and more often lately. For simple tasks I already use command-line BLAST+, and now I am wondering whether the following is possible with BLAST+:
Let's say I have a protein sequence in a file named RP1.faa, and I want to check, for each bacterial strain in the RefSeq nr database, whether RP1 is present or absent, using an e-value threshold of e.g. 1e-05. I would like a table with one row per strain: the strain in the first column, followed by the e-value, bit score and so on of only the best hit for that strain. Every strain should get a row, even if no BLAST hit was found for it. In the end I am imagining a matrix that shows whether a given protein is present or absent in each strain.
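To make the desired output more concrete, here is a rough sketch of the post-processing step I have in mind. It is not a tested pipeline. It assumes the BLAST results are already available as tab-separated text with five columns in this order: query ID, subject ID, e-value, bit score, strain name (for example from a custom tabular output format). It also assumes I have a separate list of all strains I care about. The function name `best_hit_table` and the strain and protein names in the example are made up for illustration.

```python
import csv
import io


def best_hit_table(blast_tsv, strains, max_evalue=1e-5):
    """Return one row per strain: (strain, present, evalue, bitscore).

    blast_tsv: tab-separated text, assumed columns:
               query ID, subject ID, e-value, bit score, strain name
    strains:   every strain that should appear in the table,
               including those without any hit
    """
    best = {}  # strain -> (evalue, bitscore) of its best hit so far
    reader = csv.reader(io.StringIO(blast_tsv), delimiter="\t")
    for row in reader:
        # Skip blank lines and comment lines
        if not row or row[0].startswith("#"):
            continue
        _query, _subject, evalue, bitscore, strain = row[:5]
        evalue, bitscore = float(evalue), float(bitscore)
        # Ignore hits that do not pass the e-value threshold
        if evalue > max_evalue:
            continue
        # Keep only the hit with the highest bit score per strain
        if strain not in best or bitscore > best[strain][1]:
            best[strain] = (evalue, bitscore)

    table = []
    for strain in strains:
        if strain in best:
            evalue, bitscore = best[strain]
            table.append((strain, 1, evalue, bitscore))
        else:
            # Strain without a passing hit still gets its own row
            table.append((strain, 0, None, None))
    return table


# Hypothetical example data, not real BLAST output
example_tsv = (
    "RP1\tprotA\t2e-50\t180.0\tStrain A\n"
    "RP1\tprotB\t1e-30\t120.0\tStrain A\n"
    "RP1\tprotC\t0.5\t25.0\tStrain B\n"
)
example_strains = ["Strain A", "Strain B", "Strain C"]

for row in best_hit_table(example_tsv, example_strains):
    print(*row, sep="\t")
```

In this sketch, "Strain A" would be reported as present with its best hit, while "Strain B" (only a weak hit) and "Strain C" (no hit at all) would both appear as absent rows.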
Do you know if it is possible? Thanks a lot!