I am using a metaproteomic database for the mouse gut microbiome, which I found online: http://gigadb.org/dataset/view/id/100114/token/mZlMYJIF04LshpgP
Unfortunately, the accession numbers and protein descriptions are not helpful for taxonomic analyses, since the entries look like this:
S-Fe7_GL0014216 [gene] locus=scaffold66956_1:1:1053:+ [Lack both ends] codon-table.11
The accessions follow no taxonomic pattern, and the database is too big to open in Excel.
The owners also included a text file that annotates each accession, e.g.:
S-Fe7_GL0014216 1/1 Clostridiales order root|cellular organisms|Bacteria|Firmicutes|Clostridia|Clostridiales no rank|no rank|superkingdom|phylum|class|order
Is there a join command or script that can loop through the database file and replace each matching accession with its actual taxonomic description?
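To make the goal concrete, here is a rough Python sketch of the kind of lookup I have in mind. It rests on assumptions I have not verified: that the database is a FASTA file whose header lines start with ">" followed by the accession, and that the first whitespace-delimited field of each annotation line is the accession. The function names and file names are made up for illustration.

```python
def load_taxonomy(lines):
    """Map each accession (first whitespace-delimited field) to the rest of its line."""
    taxonomy = {}
    for line in lines:
        parts = line.rstrip("\n").split(None, 1)
        if len(parts) == 2:  # skip blank or accession-only lines
            taxonomy[parts[0]] = parts[1].strip()
    return taxonomy


def annotate_headers(lines, taxonomy):
    """Yield lines unchanged, except FASTA headers whose accession is known:
    those get the taxonomy text appended after ' | '."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):  # assumed FASTA header marker
            fields = line[1:].split(None, 1)
            accession = fields[0] if fields else ""
            if accession in taxonomy:
                line = f"{line} | {taxonomy[accession]}"
        yield line


def annotate_file(fasta_path, taxonomy_path, out_path):
    """Stream the database line by line so it never has to fit in a spreadsheet."""
    with open(taxonomy_path) as tax_file:
        taxonomy = load_taxonomy(tax_file)
    with open(fasta_path) as fasta, open(out_path, "w") as out:
        for line in annotate_headers(fasta, taxonomy):
            out.write(line + "\n")
```

With hypothetical file names, this would be called as `annotate_file("database.fasta", "taxonomy.txt", "annotated.fasta")`. The sketch appends the taxonomy to the header instead of overwriting the accession, so each entry stays traceable to the original database; only the annotation file is held in memory.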
Thanks in advance for your help!