Question

Meaning of the columns in an MG-RAST abundance matrix file

4.3 years ago

barakdror • 0

Hello, I've downloaded what should be an abundance matrix file directly from the MG-RAST website (specifically, the file 700.annotation.sims.filter.seq for metagenome ID mgm4599960.3). I compared it to the assembled.fasta file of the same project, and the contig names listed in the .fasta file are indeed the ones that show up in the abundance file. However, I cannot understand what each column stands for. The first one is the contig name, the second one is the m5 ID, and the third one is probably the percent identity of the contig, but what are the remaining columns? Are they abundances (the 'counts' of the reads mapped to the contig)? This is the head of the 700.annotation.sims.filter.seq file:

> contig_1025862_1_294_-    000002aa15832b94a71e3c7de643c267    63.77   69   25  0   3   71   89   157  2.20E-21    100
> contig_1110905_1_353_-    00001aba8aee0c90a80969ea8da059f8    60.53   114  45  0   4   117  156  269  5.40E-33    139
> contig_675745_1_245_-     00001aba8aee0c90a80969ea8da059f8    66.67   75   25  0   6   80   7    81   7.40E-20    95
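For what it's worth, each row above has 12 fields, which is the same number and order of value types as the standard 12-column BLAST tabular layout (query, subject, % identity, alignment length, mismatches, gap openings, query start/end, subject start/end, e-value, bit score). The Python sketch below assumes that layout; the column names are my own labels and are not confirmed by anything in the file itself. If the assumption holds, the numeric columns are alignment statistics rather than read counts. As a sanity check, it recomputes the identity from the assumed length and mismatch columns for gap-free rows.

```python
import io

# ASSUMPTION: column names follow the standard 12-column BLAST tabular
# layout. The file has no header, so these labels are hypothetical.
BLAST_TAB_COLUMNS = [
    "query_id", "subject_md5", "percent_identity", "alignment_length",
    "mismatches", "gap_opens", "query_start", "query_end",
    "subject_start", "subject_end", "evalue", "bit_score",
]
INT_COLUMNS = BLAST_TAB_COLUMNS[3:10]
FLOAT_COLUMNS = ["percent_identity", "evalue", "bit_score"]

# The three rows quoted in the post, tab-separated.
SAMPLE = (
    "contig_1025862_1_294_-\t000002aa15832b94a71e3c7de643c267\t63.77\t69\t25\t0\t3\t71\t89\t157\t2.20E-21\t100\n"
    "contig_1110905_1_353_-\t00001aba8aee0c90a80969ea8da059f8\t60.53\t114\t45\t0\t4\t117\t156\t269\t5.40E-33\t139\n"
    "contig_675745_1_245_-\t00001aba8aee0c90a80969ea8da059f8\t66.67\t75\t25\t0\t6\t80\t7\t81\t7.40E-20\t95\n"
)


def parse_sims(handle):
    """Parse whitespace-separated rows into dicts keyed by the assumed names."""
    rows = []
    for line in handle:
        fields = line.split()
        if len(fields) != len(BLAST_TAB_COLUMNS):
            continue  # skip blank or malformed lines
        row = dict(zip(BLAST_TAB_COLUMNS, fields))
        for name in INT_COLUMNS:
            row[name] = int(row[name])
        for name in FLOAT_COLUMNS:
            row[name] = float(row[name])
        rows.append(row)
    return rows


def recomputed_identity(row):
    """Identity from length and mismatches; only meaningful without gaps."""
    matches = row["alignment_length"] - row["mismatches"]
    return round(100.0 * matches / row["alignment_length"], 2)


rows = parse_sims(io.StringIO(SAMPLE))
for row in rows:
    if row["gap_opens"] == 0:
        print(row["query_id"], row["percent_identity"], recomputed_identity(row))
```

For the three rows shown, the recomputed value agrees with the third column, which is at least consistent with reading the fourth and fifth columns as alignment length and mismatches. To run it on the real file, pass an open file handle to `parse_sims` instead of the `io.StringIO` wrapper.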

Thank you! Barak

mg-rast metagenome • 598 views
