Question: 16S rRNA classification pipeline
3 months ago by
Sus10 wrote:

Hello everyone,

I have 16S data and I'm trying to identify which genera, species, etc. are present in my samples, along with their relative abundance.

It's my first time working with 16S data and I'm not used to taxonomic classification either. I'm trying to improve what I have already done and to understand some of the concepts.

So far, to identify what's in my samples I've been relying on classifiers like Kraken or Centrifuge together with 16S databases like SILVA or Greengenes. They give reasonable results on my test samples (I'm looking at genus level for now), but my "pipeline" only consists of cleaning the files with tools like Trimmomatic or fastp before feeding them to the main classification tool.

I feel like this is very light and was wondering what I could do to improve it. I was thinking about doing some assembly beforehand (I'm currently trying ABySS and SPAdes). I also noticed that people suggest clustering reads into OTUs, but I don't really understand why you would group them, especially knowing that some bacterial species share more than 98% identity across their 16S sequences.

What am I missing about OTUs, and what can I do to improve my classification?
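To make the OTU question concrete, here is a minimal sketch of greedy clustering at an identity threshold. It is a toy under loud assumptions: the sequences are invented, pre-aligned and of equal length, identity is a plain per-position match fraction, and the names `identity`, `greedy_otus` and `mutate` are hypothetical helpers, not the API of any real OTU tool. It only illustrates the trade-off raised above: a 97% threshold absorbs reads that differ by a sequencing error, but it also merges a species that is 98% identical to the seed.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("toy example: sequences must be pre-aligned and equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs: dict, threshold: float = 0.97) -> dict:
    """Assign each sequence to the first OTU whose seed it matches at >= threshold.

    Sequences that match no existing seed start a new OTU and become its seed.
    Returns a mapping of seed id -> list of member ids.
    """
    otus = {}
    for seq_id, seq in seqs.items():
        for seed_id in otus:
            if identity(seq, seqs[seed_id]) >= threshold:
                otus[seed_id].append(seq_id)
                break
        else:  # no seed was close enough
            otus[seq_id] = [seq_id]
    return otus


def mutate(seq: str, positions) -> str:
    """Return a copy of seq with a substitution at each given position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for p in positions:
        chars[p] = swap[chars[p]]
    return "".join(chars)


# Invented 100 bp "amplicons" (not real 16S sequences).
base = "ACGT" * 25
reads = {
    "ref": base,
    "one_error": mutate(base, [10]),               # 99% identical to ref
    "close_species": mutate(base, [20, 60]),       # 98% identical to ref
    "other_genus": mutate(base, range(0, 100, 10)),  # 90% identical to ref
}

otus_97 = greedy_otus(reads, threshold=0.97)
otus_100 = greedy_otus(reads, threshold=1.0)
print(otus_97)
print(len(otus_100), "clusters when exact matches are required")
```

With the 97% threshold the error-containing read and the closely related species both fall into the same cluster as `ref`, while requiring exact matches keeps all four sequences apart. Real tools align sequences and handle abundance and chimeras, which this sketch does not attempt.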

EDIT: I found this document to be really interesting as it covers a lot of things.


Which region of the 16S gene do you have?
Also, there's no assembly needed for 16S data.
Do you have Illumina data?

written 3 months ago by Bioinformatics_NewComer320

Right now I have Illumina data targeting the V3 and V4 regions.

written 3 months ago by Sus10
3 months ago by
leaodel60 wrote:

Hi Sus, I would try the BMP pipeline. They provide a detailed pipeline that you can use as is or adapt to build your own, and you can find a detailed guide for 16S/18S rRNA and ITS data analysis. Good luck!

3 months ago by
Kevin610 wrote:

One point of reference/comparison is running the 16S workflow in using their supplied demo data. Their panel covers more regions than your data, but the data visualisations should inspire you to do more.

I don't think de novo assembly will add new information, but I might be wrong.

If you look at the results you might find that genus-level differentiation is pretty good for 16S data. If you are looking to differentiate down to species level, that's a different story.
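For checking genus-level results, here is a minimal sketch that turns a classifier report into genus-level relative abundances. It assumes the standard six-column, tab-separated Kraken-style report (percentage, clade reads, direct reads, rank code, taxon ID, indented name). The function name `genus_relative_abundance` is hypothetical and the sample report is invented purely for illustration.

```python
def genus_relative_abundance(report_text: str) -> dict:
    """Return {genus name: fraction of genus-level reads} from a Kraken-style report.

    Only rows with rank code "G" are used, and the clade read count (column 2)
    is taken so that reads assigned to species within a genus are included.
    """
    counts = {}
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        clade_reads = int(fields[1])
        rank = fields[3].strip()
        name = fields[5].strip()  # names are indented by taxonomic depth
        if rank == "G":
            counts[name] = counts.get(name, 0) + clade_reads
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


# Invented example report (numbers and taxon IDs are illustrative only).
SAMPLE_REPORT = "\n".join(
    "\t".join(fields)
    for fields in [
        ("10.00", "100", "100", "U", "0", "unclassified"),
        ("90.00", "900", "0", "R", "1", "root"),
        ("90.00", "900", "0", "D", "2", "  Bacteria"),
        ("60.00", "600", "200", "G", "1578", "    Lactobacillus"),
        ("40.00", "400", "400", "S", "1579", "      Lactobacillus acidophilus"),
        ("30.00", "300", "300", "G", "816", "    Bacteroides"),
    ]
)

abundance = genus_relative_abundance(SAMPLE_REPORT)
for genus, fraction in sorted(abundance.items(), key=lambda kv: -kv[1]):
    print(f"{genus}\t{fraction:.3f}")
```

Note that the fractions are relative to reads classified at genus level or below, so unclassified reads and reads that stop at a higher rank are left out of the denominator. Whether that is the right denominator depends on what you want to compare between samples.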
