Question: Bacteria identification using 16S rRNA sequencing data
4.7 years ago by
fxiao10 wrote:

I would like to identify as many bacteria as possible from 16S rRNA sequencing data. I found that more than 60% of the reads align to multiple bacterial species, and I don't think I should ignore them. My plan is to assign them to specific species according to the count distribution of the remaining 40% of the reads. Does this make sense? Is there an established protocol to follow in this field? Thanks.
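To make the idea in the question concrete, here is a minimal sketch of redistributing ambiguous reads in proportion to the counts of unambiguous reads. The function name `redistribute`, the input shapes, and the fallback to an even split are all assumptions made for illustration; this is not an established protocol, and it says nothing about whether species-level assignment is justified for a given 16S region.

```python
from typing import Dict, Iterable, Set


def redistribute(unique_counts: Dict[str, int],
                 ambiguous_reads: Iterable[Set[str]]) -> Dict[str, float]:
    """Split each ambiguous read across its candidate species.

    unique_counts:   species -> number of reads that aligned only to it.
    ambiguous_reads: one set of candidate species per multi-mapped read.
    Returns expected (fractional) read counts per species.
    """
    totals = {species: float(n) for species, n in unique_counts.items()}
    for candidates in ambiguous_reads:
        if not candidates:
            continue  # nothing to assign this read to
        weights = {sp: unique_counts.get(sp, 0) for sp in candidates}
        weight_sum = sum(weights.values())
        for sp in candidates:
            if weight_sum > 0:
                share = weights[sp] / weight_sum
            else:
                # No candidate has unique support: fall back to an even split.
                share = 1.0 / len(candidates)
            totals[sp] = totals.get(sp, 0.0) + share
    return totals


# Hypothetical toy data: 40 uniquely aligned reads, 4 ambiguous ones.
unique = {"species_A": 30, "species_B": 10}
ambiguous = [{"species_A", "species_B"}] * 4
expected = redistribute(unique, ambiguous)
```

With these toy numbers each ambiguous read contributes 0.75 to `species_A` and 0.25 to `species_B`. Note that this approach reinforces whatever the unique reads already suggest, so any bias in them is carried over to the ambiguous reads.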

rna-seq alignment • 2.6k views

NB. Tag should be amplicon-seq or 16S, not RNA seq.

Comment written 4.7 years ago by Daniel
4.7 years ago by
jb30 wrote:

You don't say which region of the 16S gene you amplified. Your ability to discriminate between representative sequences at a given taxonomic level depends on the region sequenced. Use highly curated databases to determine what you have, such as those distributed with MOTHUR and/or QIIME (software designed specifically for analyzing 16S sequences). Misidentification can arise from errors or missing data in your sequences as well as in the database you are using. Reads that hit multiple species most likely cannot be resolved at the species level, only at a higher level such as family.

4.7 years ago by
Cardiff University
Daniel3.8k wrote:

Check out the QIIME tutorial; it's pretty thorough and should let you do everything you need. Notably, you typically don't look at the data at the species level, as this is highly variable, but at the genus or family level, which is built into this kind of analysis.

4.7 years ago by
dago2.6k wrote:

I think that for a 16S study there is no better option than the resources offered by SILVA. They rely on a manually curated database that has been used extensively and is considered a gold standard for bacterial phylogeny.

Hope it helps

4.7 years ago by
University Park, USA
Istvan Albert ♦♦ 85k wrote:

Here is how the MEGAN tool does it:

The main problem addressed by MEGAN is to compute a “species profile” by assigning the reads from a metagenomics sequencing experiment to appropriate taxa in the NCBI taxonomy. At present, this program implements the following naive approach to this problem:

  1. Compare a given set of DNA reads to a database of known sequences, such as NCBI-NR or NCBI-NT [3], using a sequence comparison tool such as BLAST [1].
  2. Process this data to determine all hits of taxa by reads.
  3. For each read r, let H be the set of all taxa that r hits.
  4. Find the lowest node v in the NCBI taxonomy that encompasses the set of hit taxa H and assign the read r to the taxon represented by v. We call this the naive LCA-assignment algorithm (LCA = “lowest common ancestor”). In this approach, every read is assigned to some taxon.
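
The naive LCA assignment in steps 3 and 4 can be sketched as follows. This is an illustrative sketch, not MEGAN's code: the `PARENT` table is a hypothetical, heavily simplified toy taxonomy (real NCBI lineages contain more ranks), and the function names `lineage`, `lowest_common_ancestor`, and `assign_reads` are made up for this example.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Enterobacteriaceae": "Proteobacteria",
    "Escherichia": "Enterobacteriaceae",
    "Escherichia coli": "Escherichia",
    "Escherichia fergusonii": "Escherichia",
    "Salmonella": "Enterobacteriaceae",
    "Salmonella enterica": "Salmonella",
}


def lineage(taxon: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the path from the root down to `taxon`."""
    path = []
    node: Optional[str] = taxon
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def lowest_common_ancestor(hits: Iterable[str],
                           parent: Dict[str, Optional[str]]) -> str:
    """Return the lowest node whose subtree contains every taxon in `hits`."""
    lineages = [lineage(taxon, parent) for taxon in set(hits)]
    if not lineages:
        raise ValueError("a read needs at least one hit to be assigned")
    lca = lineages[0][0]  # every lineage starts at the root
    # Walk down from the root while all lineages agree on the node.
    for level in zip(*lineages):
        if any(name != level[0] for name in level):
            break
        lca = level[0]
    return lca


def assign_reads(read_hits: Dict[str, List[str]],
                 parent: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Assign each read r to the LCA of its hit set H."""
    return {read: lowest_common_ancestor(hits, parent)
            for read, hits in read_hits.items()}


# Hypothetical hit sets for three reads.
read_hits = {
    "read1": ["Escherichia coli"],
    "read2": ["Escherichia coli", "Escherichia fergusonii"],
    "read3": ["Escherichia coli", "Salmonella enterica"],
}
assignments = assign_reads(read_hits, PARENT)
```

In this toy example a read with a single hit stays at species level, a read hitting two Escherichia species moves up to the genus, and a read hitting both Escherichia and Salmonella moves up to the family. That matches the point made in the other answers: reads that hit multiple species end up resolved at a higher rank rather than being discarded.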