Hello, I've downloaded what should be an abundance matrix file directly from the MG-RAST website (specifically, the file 700.annotation.sims.filter.seq for metagenome ID mgm4599960.3). I've compared it to the assembled .fasta file of the same project, and the contig names listed in the .fasta file are indeed the ones showing up in the abundance file.

However, I cannot understand what each column stands for. The first one is the contig name, the second one is the m5 ID, and the third one is probably the percent identity of the contig's match, but which columns hold the abundance (the 'counts' of the reads mapped to the contig)? This is the head of the .annotation.sims.filter.seq file:
> contig_1025862_1_294_- 000002aa15832b94a71e3c7de643c267 63.77 69 25 0 3 71 89 157 2.20E-21 100
> contig_1110905_1_353_- 00001aba8aee0c90a80969ea8da059f8 60.53 114 45 0 4 117 156 269 5.40E-33 139
> contig_675745_1_245_- 00001aba8aee0c90a80969ea8da059f8 66.67 75 25 0 6 80 7 81 7.40E-20 95
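
One possible reading, stated here only as an assumption: each row above has twelve whitespace-separated fields, which is the same count as the standard BLAST tabular (m8) layout, and the third field being a percent identity would fit that layout. The sketch below labels the fields under that assumption so the rows are easier to inspect. The column names in `ASSUMED_COLUMNS` and the helper `parse_sims_line` are hypothetical and not taken from MG-RAST documentation. If this reading is right, none of the twelve fields is a read count.

```python
# Hypothetical labels: the standard 12-column BLAST tabular (m8) order.
# This is an assumption about the file, not something MG-RAST confirms here.
ASSUMED_COLUMNS = [
    "query_id", "subject_md5", "percent_identity", "alignment_length",
    "mismatches", "gap_openings", "query_start", "query_end",
    "subject_start", "subject_end", "e_value", "bit_score",
]

FLOAT_FIELDS = {"percent_identity", "e_value", "bit_score"}
TEXT_FIELDS = {"query_id", "subject_md5"}


def parse_sims_line(line):
    """Split one row into a dict keyed by the assumed column names."""
    # Drop the leading "> " quote marker shown in the post, if present.
    fields = line.strip().lstrip("> ").split()
    if len(fields) < len(ASSUMED_COLUMNS):
        raise ValueError(f"expected at least 12 fields, got {len(fields)}")
    record = {}
    for name, value in zip(ASSUMED_COLUMNS, fields):
        if name in TEXT_FIELDS:
            record[name] = value
        elif name in FLOAT_FIELDS:
            record[name] = float(value)
        else:
            record[name] = int(value)
    # Keep any trailing fields (for example a sequence) without interpreting them.
    record["extra"] = fields[len(ASSUMED_COLUMNS):]
    return record


row = "> contig_1025862_1_294_- 000002aa15832b94a71e3c7de643c267 63.77 69 25 0 3 71 89 157 2.20E-21 100"
record = parse_sims_line(row)
for name in ASSUMED_COLUMNS:
    print(f"{name:>18}: {record[name]}")
```

Whether the real file is tab-separated or carries additional trailing columns is not visible from the excerpt, so the sketch splits on any whitespace and keeps extra fields untouched.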
Thank you! Barak