Question: Database annotation for metaproteomics
16 days ago by
marlenejensen20 wrote:

Hey all,

I am using a metaproteomic database for the mouse gut microbiome, which I found online: http://gigadb.org/dataset/view/id/100114/token/mZlMYJIF04LshpgP

Unfortunately, the accession numbers and protein descriptions are not very helpful for taxonomic analyses, since they look like this:

S-Fe7_GL0014216    [gene]  locus=scaffold66956_1:1:1053:+ [Lack both ends] codon-table.11

There is no pattern in the accession in terms of taxonomy and the database is too big for my excel.

The owners also included a text file with an explanation of each accession, e.g.

S-Fe7_GL0014216    1/1    Clostridiales    order    root|cellular organisms|Bacteria|Firmicutes|Clostridia|Clostridiales    no rank|no rank|superkingdom|phylum|class|order
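
For reference, a line in this format can be split into its parts with a few lines of Python. This is only a sketch: it assumes the six columns are tab-separated (the spacing above suggests so, but the real file may differ), the meaning of the second column (1/1) is not stated here so it is ignored, and the function name parse_annotation_line is made up for illustration.

```python
def parse_annotation_line(line):
    """Split one annotation line into accession, assigned taxon, rank and lineage.

    Assumption: six tab-separated columns, as in the example line above.
    """
    fields = line.rstrip("\n").split("\t")
    # The second column ("1/1") is not explained in the post, so it is ignored.
    accession, _second_column, taxon, rank, names, ranks = fields[:6]
    # Pair each lineage name with its rank label; "no rank" entries are skipped.
    lineage = {
        r: n
        for n, r in zip(names.split("|"), ranks.split("|"))
        if r != "no rank"
    }
    return {"accession": accession, "taxon": taxon, "rank": rank, "lineage": lineage}


example = (
    "S-Fe7_GL0014216\t1/1\tClostridiales\torder\t"
    "root|cellular organisms|Bacteria|Firmicutes|Clostridia|Clostridiales\t"
    "no rank|no rank|superkingdom|phylum|class|order"
)
record = parse_annotation_line(example)
print(record["lineage"]["phylum"])  # Firmicutes
```

Keeping the lineage as a rank-to-name mapping makes it easy to pull out whichever level (phylum, class, order) a downstream taxonomic summary needs.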

I am wondering if there is a join command or a script that loops through the file and replaces each matching accession with the actual taxonomic description?

Thanks in advance for your help!

Cheers


the database is too big for my excel

via @tim_yates

Time to switch to R/Python :-)

written 16 days ago by RamRS
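
As a rough idea of what the Python route could look like, here is a minimal sketch that loads the annotation file into a dictionary and then streams through the database once, rewriting each header. Several things are assumptions rather than facts from the post: that the database is a FASTA file whose header lines start with ">", that the accession is the first whitespace-delimited token of the header, and that the annotation file is tab-separated. The function names load_taxonomy and annotate_fasta are hypothetical.

```python
def load_taxonomy(annotation_path):
    """Map accession -> 'taxon (rank); lineage' from the annotation file.

    Assumption: tab-separated columns in the order shown in the question.
    """
    taxonomy = {}
    with open(annotation_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 5:
                continue  # skip blank or malformed lines
            accession, taxon, rank, lineage = fields[0], fields[2], fields[3], fields[4]
            taxonomy[accession] = f"{taxon} ({rank}); {lineage}"
    return taxonomy


def annotate_fasta(fasta_in, fasta_out, taxonomy):
    """Copy the FASTA file line by line, replacing each header description."""
    with open(fasta_in) as src, open(fasta_out, "w") as dst:
        for line in src:
            if line.startswith(">"):
                # Assumption: the accession is the first token after ">".
                parts = line[1:].split(None, 1)
                accession = parts[0] if parts else ""
                label = taxonomy.get(accession, "unclassified")
                dst.write(f">{accession} {label}\n")
            else:
                dst.write(line)  # sequence lines are passed through unchanged
```

With placeholder file names, this would be called as annotate_fasta("database.fa", "database.annotated.fa", load_taxonomy("annotation.txt")). The accession is kept in the new header so that entries stay unique, and the database is read one line at a time, so only the annotation table has to fit in memory.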