Hello all,
I am using PLAST (a faster alternative to BLAST) against the whole NCBI protein database to detect unknown contamination in my FASTA file.
As a result, I have a table with 10,000 lines and 3 columns: the contig ID (from my FASTA file), the hit accession, and the hit ID (for instance gi|219129610|ref|XP_002184977.1|).
For each line, I would like to know whether the hit belongs to Rhodophyta, in order to decide whether to delete the contig or keep it.
To do this, I first plan to add a new 4th column to my table containing the phylum taxid associated with the hit accession, but I don't know how to do that. I would be grateful if somebody could help me with this or suggest a better way to achieve my goal.
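To make the idea concrete, here is a minimal sketch of what I have in mind, not a working pipeline. It assumes I already have a dictionary mapping each hit accession to a taxid, and the lines of nodes.dmp from the NCBI taxdump archive (fields separated by tab, pipe, tab). All function names are my own, and the Rhodophyta taxid (2763) should be double-checked in the NCBI Taxonomy browser.

```python
# Assumption: 2763 is the NCBI taxid of Rhodophyta (please verify).
RHODOPHYTA_TAXID = 2763


def extract_accession(hit_id):
    """Pull the accession out of an ID like 'gi|219129610|ref|XP_002184977.1|'."""
    fields = hit_id.strip().split("|")
    if len(fields) >= 4 and fields[3]:
        return fields[3]
    return hit_id.strip()  # already a bare accession


def parse_nodes(lines):
    """Read nodes.dmp lines into {taxid: parent_taxid} and {taxid: rank}."""
    parent, rank = {}, {}
    for line in lines:
        fields = line.rstrip("\n").split("\t|\t")
        if len(fields) < 3:
            continue
        taxid = int(fields[0])
        parent[taxid] = int(fields[1])
        rank[taxid] = fields[2].strip("\t| ")
    return parent, rank


def lineage(taxid, parent):
    """Return the taxids from `taxid` up to the root (the root is its own parent)."""
    seen = []
    while taxid in parent and taxid not in seen:
        seen.append(taxid)
        taxid = parent[taxid]
    return seen


def phylum_taxid(taxid, parent, rank):
    """Return the taxid of the first ancestor ranked 'phylum', or None."""
    for ancestor in lineage(taxid, parent):
        if rank.get(ancestor) == "phylum":
            return ancestor
    return None


def is_in_clade(taxid, clade_taxid, parent):
    """True if `clade_taxid` appears anywhere in the lineage of `taxid`."""
    return clade_taxid in lineage(taxid, parent)


def annotate_rows(rows, acc2taxid, parent, rank):
    """Yield each (contig, accession, hit_id) row with a 4th phylum-taxid column."""
    for row in rows:
        taxid = acc2taxid.get(row[1])
        phylum = phylum_taxid(taxid, parent, rank) if taxid is not None else None
        yield tuple(row) + (phylum,)
```

Checking `is_in_clade(taxid, RHODOPHYTA_TAXID, parent)` directly might be more robust than comparing the phylum column, since it does not depend on which rank NCBI assigns to the group.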
I've looked at BBMap (How to obtain taxonomic information from GIs), but it requires the file ftp://ftp.ncbi.nih.gov/pub/taxonomy/gi_taxid_prot.dmp.gz, which was deleted in 2019...
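As far as I understand, the GI-based files were replaced by accession-based ones, such as prot.accession2taxid.gz in the pub/taxonomy/accession2taxid/ directory of the same FTP site. Assuming that file is tab-separated with a header line and the columns accession, accession.version, taxid, gi, a rough sketch for pulling out only the accessions present in my table could look like this (the function name is my own):

```python
import gzip


def load_taxids(path, wanted):
    """Return {accession.version: taxid} for the accessions listed in `wanted`.

    Assumes a gzipped, tab-separated file with a header line and the
    columns: accession, accession.version, taxid, gi.
    """
    wanted = set(wanted)
    found = {}
    with gzip.open(path, "rt") as handle:
        next(handle, None)  # skip the header line
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 3 and fields[1] in wanted:
                found[fields[1]] = int(fields[2])
                if len(found) == len(wanted):
                    break  # every requested accession was found
    return found
```

The file is streamed line by line and only the wanted accessions are kept, so the whole mapping never has to fit in memory.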
Thank you again for your help!