Making a custom database (like Greengenes) for QIIME closed-reference OTU picking
9.3 years ago
arnstrm ★ 1.8k

I have a special case where I want to use my own 16S data (already classified to species) with QIIME. Does anyone know how I can create a Greengenes-like database from my data, to be used as the reference for the closed-reference OTU picking pipeline?

Thanks very much for any help you can offer!

qiime green genes database • 6.1k views
9.3 years ago
5heikki 11k

Because Greengenes is rather limited with Archaea, I recently made a QIIME compatible version of SILVA 119 nr99.

You just need a taxonomy mapping file with one line per sequence, in the form ID<TAB>kingdom; phylum; class; order; family; genus (note: six levels if you use the RDP classifier; for other assignment methods maybe all the way down to species level).

ID1    k__Archaea; p__Euryarchaeota; c__; o__; f__; g__
ID2    k__Archaea; p__Euryarchaeota; c__; o__; f__; g__
ID3    k__Archaea; p__Euryarchaeota; c__; o__; f__; g__
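
A minimal sketch of how one such line could be built, assuming the lineage is available as a semicolon-separated string with no rank prefixes. The function name `to_taxonomy_line` and the input format are hypothetical, not part of QIIME; missing ranks are padded with empty prefixes, as in the lines above.

```python
# Rank prefixes for the six levels described above (kingdom to genus).
RANK_PREFIXES = ["k__", "p__", "c__", "o__", "f__", "g__"]


def to_taxonomy_line(seq_id, lineage):
    """Return 'ID<TAB>k__...; p__...; c__...; o__...; f__...; g__...'.

    `lineage` is assumed to look like 'Archaea;Euryarchaeota' (hypothetical
    input format). Anything below genus is dropped, and missing ranks are
    padded with the bare prefix, e.g. 'g__'.
    """
    names = [name.strip() for name in lineage.split(";")]
    names = names[:len(RANK_PREFIXES)]
    names += [""] * (len(RANK_PREFIXES) - len(names))
    ranks = [prefix + name for prefix, name in zip(RANK_PREFIXES, names)]
    return seq_id + "\t" + "; ".join(ranks)


print(to_taxonomy_line("ID1", "Archaea;Euryarchaeota"))
```

The ID and the lineage are separated by a real tab character here, which is easy to lose when copying lines from a web page.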

And then you need your sequences in a FASTA file with corresponding headers, in this case ID1, ID2, ID3. For some QIIME analyses you would also need an alignment of your sequences and a tree. I made my reference with a few awk one-liners and GNU coreutils (basically I just parsed the taxonomy info from the sequence headers and made sure that every entry went down to genus level, with just "g__" representing missing genus info, and so forth).
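
Since the FASTA headers and the mapping file have to use the same IDs, a quick consistency check can save some debugging. This is only a sketch in Python (not the awk approach mentioned above): the helper names are hypothetical, the FASTA ID is assumed to be the first whitespace-delimited token after ">", and the mapping file is assumed to be tab-separated.

```python
def fasta_ids(fasta_lines):
    """Collect sequence IDs: the first token after '>' on each header line."""
    ids = set()
    for line in fasta_lines:
        if line.startswith(">") and line[1:].strip():
            ids.add(line[1:].split()[0])
    return ids


def taxonomy_ids(taxonomy_lines):
    """Collect IDs from the first tab-separated column of the mapping file."""
    return {line.split("\t", 1)[0].strip() for line in taxonomy_lines if line.strip()}


def compare_ids(fasta_lines, taxonomy_lines):
    """Return (IDs only in the FASTA, IDs only in the taxonomy file), sorted."""
    seqs = fasta_ids(fasta_lines)
    taxa = taxonomy_ids(taxonomy_lines)
    return sorted(seqs - taxa), sorted(taxa - seqs)


# Small in-memory example; with real files, pass open file handles instead.
fasta = [">ID1", "ACGT", ">ID2 some description", "GGCC"]
taxonomy = [
    "ID1\tk__Archaea; p__Euryarchaeota; c__; o__; f__; g__",
    "ID3\tk__Archaea; p__Euryarchaeota; c__; o__; f__; g__",
]
only_in_fasta, only_in_taxonomy = compare_ids(fasta, taxonomy)
print("Missing from taxonomy file:", only_in_fasta)
print("Missing from FASTA:", only_in_taxonomy)
```

Both lists should be empty before the pair of files is used as a reference.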

Thanks very much for the reply! Did you create a complete database (with alignments, trees, etc.) or just the FASTA sequences and the mapping file? It would be great if you could give a general overview of the steps to achieve this!

I didn't need alignments or trees, so I didn't bother. With SILVA, I could have just grabbed the alignment from their site, though. I believe Greengenes used ssu-align for the alignment and FastTree for tree building. All the info is in the README files.

Thank you for pointing out the README file. I see they have a brief methods section in there. I'll try to replicate that!

