Question: Database annotation for metaproteomics
12 months ago, marlenejensen wrote:

Hey all,

I am using a metaproteomic database for the mouse gut microbiome, which I found online: http://gigadb.org/dataset/view/id/100114/token/mZlMYJIF04LshpgP

Unfortunately, the accession numbers and protein descriptions are not much help for taxonomic analyses, since they look like this:

S-Fe7_GL0014216    [gene]  locus=scaffold66956_1:1:1053:+ [Lack both ends] codon-table.11

The accessions follow no taxonomic pattern, and the database is too big for my Excel.

The owners also included a text file with a taxonomic annotation for each accession, e.g.

S-Fe7_GL0014216    1/1    Clostridiales    order    root|cellular organisms|Bacteria|Firmicutes|Clostridia|Clostridiales    no rank|no rank|superkingdom|phylum|class|order

Is there a join command or a script that loops through the file and replaces each matching accession with its actual taxonomic description?

Thanks in advance for your help!

Cheers


"the database is too big for my Excel"

via t@tim_yates

Time to switch to R/Python :-)
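
A minimal Python sketch of the lookup-and-replace asked about in the question. Several things here are assumptions rather than facts about the dataset: that the database is a FASTA file whose header lines start with ">" followed by the accession, that the annotation file is tab-separated with the accession, taxon name, rank and lineage in columns 1, 3, 4 and 5 (as in the example line), and the function names, which are made up for illustration. The FASTA file is streamed line by line, so only the annotation table is held in memory.

```python
def load_taxonomy(annotation_path):
    """Map accession -> (taxon, rank, lineage) from the annotation table.

    Assumes tab-separated columns: accession, fraction, taxon, rank, lineage, ...
    """
    taxonomy = {}
    with open(annotation_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 5:
                continue  # skip blank or malformed lines
            accession, _, taxon, rank, lineage = fields[:5]
            taxonomy[accession] = (taxon, rank, lineage)
    return taxonomy


def annotate_header(header, taxonomy):
    """Append taxonomy to one FASTA header (a line starting with '>')."""
    parts = header[1:].split(None, 1)
    if not parts:
        return header  # empty header, nothing to look up
    hit = taxonomy.get(parts[0])  # parts[0] is the accession
    if hit is None:
        return header  # accession not in the annotation file: leave as-is
    taxon, rank, lineage = hit
    return f"{header} taxon={taxon} rank={rank} lineage={lineage}"


def annotate_fasta(fasta_path, annotation_path, out_path):
    """Write a copy of the FASTA file with taxonomy added to each header."""
    taxonomy = load_taxonomy(annotation_path)
    with open(fasta_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                line = annotate_header(line.rstrip("\n"), taxonomy) + "\n"
            dst.write(line)
```

It could be called as annotate_fasta("database.fa", "annotation.tsv", "database.annotated.fa"), where all three file names are placeholders. The original header is kept and the taxonomy is appended, so nothing from the database is lost if the column assumptions turn out to be wrong.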

written 12 months ago by RamRS