Database annotation for metaproteomics
5.0 years ago

Hey all,

I am using a metaproteomic database for the mouse gut microbiome, which I found online: http://gigadb.org/dataset/view/id/100114/token/mZlMYJIF04LshpgP

Unfortunately, the accession numbers and protein descriptions are not much help for taxonomic analyses, since they look like this:

S-Fe7_GL0014216    [gene]  locus=scaffold66956_1:1:1053:+ [Lack both ends] codon-table.11

The accessions follow no taxonomic pattern, and the database is too big to open in Excel.

The owners also included a text file with a taxonomic annotation for each accession, e.g.:

S-Fe7_GL0014216    1/1    Clostridiales    order    root|cellular organisms|Bacteria|Firmicutes|Clostridia|Clostridiales    no rank|no rank|superkingdom|phylum|class|order

Is there a join command or a script that can loop through the database and replace each matching accession with the actual species description?

Thanks in advance for your help!

Cheers


the database is too big for my excel

via @tim_yates

Time to switch to R/Python :-)
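
For the replacement asked about in the question, a minimal Python sketch might look like the one below. It rests on several assumptions that are not confirmed by the post: the database is a FASTA file whose header lines start with ">" followed by the accession, the annotation file is tab-separated with the columns in the order shown in the example (accession, score, taxon, rank, lineage, lineage ranks), and the annotation table fits in memory as a dictionary. The function names and file names are hypothetical.

```python
def load_taxonomy(path):
    """Map accession -> (taxon, rank, lineage) from the annotation file.

    Assumes tab-separated columns: accession, score, taxon, rank, lineage, ...
    """
    taxonomy = {}
    with open(path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 5:
                continue  # skip blank or malformed lines
            accession, _score, taxon, rank, lineage = fields[:5]
            taxonomy[accession] = (taxon, rank, lineage)
    return taxonomy


def annotate_header(header, taxonomy):
    """Append taxonomy to one FASTA header; unknown accessions are left as-is."""
    parts = header[1:].split(None, 1)  # drop ">" and isolate the accession
    if not parts:
        return header
    hit = taxonomy.get(parts[0])
    if hit is None:
        return header
    taxon, rank, lineage = hit
    return f"{header} [taxon={taxon}] [rank={rank}] [lineage={lineage}]"


def annotate_fasta(fasta_in, taxonomy_path, fasta_out):
    """Stream the FASTA file line by line so its size does not matter."""
    taxonomy = load_taxonomy(taxonomy_path)
    with open(fasta_in) as src, open(fasta_out, "w") as dst:
        for line in src:
            if line.startswith(">"):
                line = annotate_header(line.rstrip("\n"), taxonomy) + "\n"
            dst.write(line)


# Hypothetical usage (file names are placeholders):
# annotate_fasta("database.fasta", "taxonomy.tsv", "database.annotated.fasta")
```

The sketch appends the taxonomy to each header rather than overwriting the accession, so the original identifiers stay searchable. Only the annotation table is held in memory; the sequences themselves are streamed.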
