Best strategy for matching OTU sequences from different runs?
6.9 years ago


I was wondering if somebody could please help me solve this problem. I have several FASTA files, all containing 16S sequences from gut microbes, and I would like to determine how OTUs compare across samples. For example, is OTU01 in sample X the same as OTU01 in samples Y, W and Z? Ideally, I would like to combine the OTU tables for all of these samples into one single matrix, with OTU and sample as variables and OTU count as the response. The problem is that there are multiple FASTA files, and the sequences come from different individuals, different runs and different sequencing platforms (454 and Illumina).

I could probably do this manually: for each sample, generate a file containing only the representative sequences and the number of sequences per OTU (using Mothur), then match representative sequences that are similar across samples and give them the same name (e.g. "OTU01"). However, I believe there must be an easier and quicker way of accomplishing this task. If you have done anything similar, please let me know what tools or strategy you used. I appreciate your help!
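To make the manual approach concrete, here is a minimal Python sketch of the matrix I am after. It is only an illustration under strong assumptions: the function name `merge_otu_tables` and the toy sequences are hypothetical, each sample is assumed to already be reduced to representative sequences with counts, and sequences are matched by exact identity, which is a stand-in for the similarity-based matching that real 454 and Illumina data would need.

```python
from collections import defaultdict


def merge_otu_tables(per_sample):
    """Merge per-sample OTU counts into one OTU x sample matrix.

    per_sample: {sample_name: {representative_sequence: count}}
    Returns (table, samples), where table maps a shared OTU name to a
    list of counts ordered like `samples`.

    ASSUMPTION: two representatives are "the same OTU" only if their
    sequences are identical (case-insensitive). This is a placeholder
    for a real similarity criterion.
    """
    samples = sorted(per_sample)
    otu_names = {}              # sequence -> shared OTU name
    counts = defaultdict(dict)  # OTU name -> {sample: count}

    for sample in samples:
        for seq, count in per_sample[sample].items():
            key = seq.upper()
            if key not in otu_names:
                # First time this sequence is seen: give it the next name.
                otu_names[key] = f"OTU{len(otu_names) + 1:02d}"
            name = otu_names[key]
            counts[name][sample] = counts[name].get(sample, 0) + count

    # Fill in zeros for samples in which an OTU was not observed.
    table = {otu: [row.get(s, 0) for s in samples] for otu, row in counts.items()}
    return table, samples


# Toy input (made-up sequences and counts, for illustration only).
toy = {
    "X": {"ACGT": 5, "GGCC": 2},
    "Y": {"acgt": 3, "TTAA": 1},
}
table, samples = merge_otu_tables(toy)
print("OTU", *samples, sep="\t")
for otu in sorted(table):
    print(otu, *table[otu], sep="\t")
```

Each row of the result is one shared OTU and each column is one sample, so the same name always refers to the same sequence across samples. What I do not know is how best to replace the exact-match step with something that tolerates differences between runs and platforms.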

OTU 454 Illumina Mothur • 1.8k views
