Running blastp with BLAST+ 2.15.0 against custom database; need to identify hits
6 months ago

Note: I am very new to bioinformatics!

I am on a Windows 11 machine using BLAST+ 2.15.0 to run blastp queries against a custom database of shotgun metagenomic data from this website: http://gigadb.org/dataset/100842

I am querying the 02_AnaerobicDigestion_GeneCatalog_gene.pep.fa file using blastp, and the results returned (to a .txt or .xml file) look like this:

[Image: blastp hits from metagenomic database]

I want to know what bacterial strain/species is associated with each hit, but all the subjects have an AD_gene_#### identifier (from the metagenome sequencing) instead of any kind of species/strain identifier.

I know that I should be able to collect protein sequences from the blastp results into a file, but I do not know how to do this.

I would then need to blastp these sequences against the non-redundant protein database and write a file that contains information about the taxonomy of the top blastp hit.

I don't need the amino acid sequence at that point, but just some kind of strain identifier that I can use to create a list of bacterial "species."

In summary, I want a list of bacterial species that contain a homolog of a protein of interest from a shotgun metagenome dataset.

I'm not sure how to get the output that I'm looking for and would appreciate any help!

shotgun metagenomics blastp taxonomy • 487 views
6 months ago
GenoMax 147k

I know that I should be able to collect protein sequences from the blastp results into a file, but I do not know how to do this.

You can do that by extracting the sequences you need from the custom database with the blastdbcmd utility included in BLAST+. See the help: https://www.ncbi.nlm.nih.gov/books/NBK569853/
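As a rough illustration of that extraction step, here is a minimal Python sketch. It assumes the first blastp run against the custom database was saved in tabular form (-outfmt 6, subject ID in the second column) and that the database was built with makeblastdb -parse_seqids, which blastdbcmd needs in order to look sequences up by ID. The function names and all file and database names in it are hypothetical placeholders, not anything from the original post.

```python
import subprocess
from pathlib import Path


def unique_subject_ids(tabular_text):
    """Collect subject IDs (column 2 of -outfmt 6) in first-seen order, without duplicates."""
    seen = {}
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines (e.g. from -outfmt 7)
        fields = line.split("\t")
        if len(fields) >= 2:
            seen.setdefault(fields[1], None)
    return list(seen)


def blastdbcmd_args(db, id_file, out_fasta):
    """Build the blastdbcmd call that pulls every ID listed in id_file out of db as FASTA."""
    return ["blastdbcmd", "-db", db, "-entry_batch", str(id_file), "-out", str(out_fasta)]


def extract_hit_sequences(blast_tsv, db, id_file, out_fasta):
    """Write the hit IDs to id_file, then run blastdbcmd (must be on PATH) to fetch them."""
    ids = unique_subject_ids(Path(blast_tsv).read_text())
    Path(id_file).write_text("\n".join(ids) + "\n")
    subprocess.run(blastdbcmd_args(db, id_file, out_fasta), check=True)
    return ids
```

Calling extract_hit_sequences("hits_vs_catalog.tsv", "AD_catalog", "hit_ids.txt", "hit_proteins.fa") with your own file and database names should leave a FASTA file of the hit proteins that can be used as the query for the search against nr.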

As for the rest of the analysis, it would be better to use an easily parsable BLAST output format when you run the search against nr. Look into -outfmt 6 for this purpose: https://www.metagenomics.wiki/tools/blast/blastn-output-format-6
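To show why the tabular format is convenient, here is a small Python sketch that keeps the best-scoring nr hit per query and turns the results into a species list. It assumes the nr search was run with a custom column list, -outfmt "6 qseqid sseqid pident evalue bitscore staxids sscinames"; the choice of columns is my assumption, and as far as I know sscinames is only filled in when the NCBI taxdb files are available to BLAST (otherwise it shows N/A). The function names are hypothetical.

```python
# Column layout assumed for the nr search; pass OUTFMT to blastp via -outfmt.
COLUMNS = ["qseqid", "sseqid", "pident", "evalue", "bitscore", "staxids", "sscinames"]
OUTFMT = "6 " + " ".join(COLUMNS)


def top_hit_per_query(tabular_text):
    """Return {query ID: hit record} keeping the highest-bitscore hit for each query."""
    best = {}
    for line in tabular_text.splitlines():
        fields = line.rstrip("\n").split("\t")
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        hit = dict(zip(COLUMNS, fields))
        current = best.get(hit["qseqid"])
        if current is None or float(hit["bitscore"]) > float(current["bitscore"]):
            best[hit["qseqid"]] = hit
    return best


def species_list(best_hits):
    """Sorted, de-duplicated scientific names taken from the top hits."""
    return sorted({hit["sscinames"] for hit in best_hits.values()})
```

Reading the result file with open(...).read() and passing the text to top_hit_per_query, then species_list, gives one name per distinct top-hit organism. A hit that maps to several taxa can carry several names separated by semicolons in sscinames; this sketch leaves such entries as they are.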
