Question: Question About The Choice Of Reference Genomes For Mapping Metagenomics Reads
7.4 years ago by
Monzoor (280) wrote:

My first question is what are the uses/applications of reference genome mapping?

Second, assuming that I have sequenced an environmental sample for metagenomic analysis, since I have no idea of its taxonomic composition, how would I decide which genome I should use as a reference for mapping my reads?

genome metagenomics mapping • 2.1k views
modified 4.3 years ago by Biostar ♦♦ 20 • written 7.4 years ago by Monzoor (280)

Hi Monzoor. Here are a few tips for writing better questions. 1) Describe what you are doing. For people to help you, they must have a good understanding of what you are trying to accomplish. 2) Describe what you have tried so far, so that you don't get answers telling you to do things you have already done. 3) Ask a specific question, including a question mark! This seems obvious, but I had to add 3 question marks to your questions and remove one from a sentence that was clearly not a question. 4) In order to get informative answers, you must write informative and well-formatted questions. :)

written 7.4 years ago by Eric Normandeau (9.7k)

And you can edit your question (i.e. this question) as soon as you have the time, to take Eric's suggestions into account.

written 7.4 years ago by Nicojo (1.1k)

I agree. Will be careful next time.

written 7.4 years ago by Monzoor (280)
7.4 years ago by
Bergen, Norway
Michael Dondrup (44k) wrote:

Hi Monzoor, here are some answers from my experience; I hope they help:

1) In my understanding, single reference genome mapping is used when you have sequence reads that can be linked to a single reference genome (or very few). Applications of reference genome mapping include, among others:

  • re-sequencing of genomic DNA
  • genome sequencing of individuals for variant detection (SNPs, copy number variations)
  • sequencing of new species closely related to reference to guide assembly
  • RNA-seq
  • ChIP-seq

2) Metagenomics was not in that list. You have already named the reason: you cannot know in advance what you will see in that complex mixture of DNA. Since you cannot choose a reference a priori, the best bet is to BLAST the reads against all available sequences, for example the full NCBI nt BLAST database. Metagenomics studies are often interested in microbial communities, so you could restrict the search to prokaryotic sequences, at the cost of neglecting any eukaryotic contribution or contamination.

Reference genome mapping can still be useful in metagenomics, for example when the first step reveals a predominant taxonomic assignment. You can then try to map the reads, or assemble them, using the genome of the predominant taxon/organism as a reference. We tried this, for example, in an analysis of the metagenome of a biogas plant, and there are other publications doing something similar.
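To illustrate how a predominant taxon might be picked out of the BLAST results, here is a minimal Python sketch. It assumes the hits were saved in BLAST's default tabular format (`-outfmt 6`, where column 1 is the read ID, column 2 the subject ID and column 12 the bit score). The function names, the toy hits and the `subject_to_taxon` lookup table are hypothetical and only there for illustration; in practice the subject-to-taxon mapping would have to come from a taxonomy resource, and a simple best-hit tally is only one of several ways to assign reads.

```python
from collections import Counter


def best_hits(blast_lines):
    """Return {read_id: subject_id}, keeping the highest bit score hit per read."""
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        read_id, subject_id, bitscore = fields[0], fields[1], float(fields[11])
        if read_id not in best or bitscore > best[read_id][1]:
            best[read_id] = (subject_id, bitscore)
    return {read_id: hit[0] for read_id, hit in best.items()}


def taxon_counts(blast_lines, subject_to_taxon):
    """Count best hits per taxon; subjects missing from the lookup are 'unassigned'."""
    counts = Counter()
    for subject_id in best_hits(blast_lines).values():
        counts[subject_to_taxon.get(subject_id, "unassigned")] += 1
    return counts


# Toy data, invented for this example (not real BLAST output).
example = [
    "read1\tsubjA\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-40\t180",
    "read1\tsubjB\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-20\t120",
    "read2\tsubjA\t97.0\t100\t3\t0\t1\t100\t200\t299\t1e-38\t175",
    "read3\tsubjC\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-30\t150",
]
lookup = {"subjA": "taxon A", "subjB": "taxon B", "subjC": "taxon C"}  # hypothetical

counts = taxon_counts(example, lookup)
print(counts.most_common(1))  # the most frequent taxon and its read count
```

If one taxon clearly dominates the tally, its genome is a candidate reference for the mapping step described above.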

Citing the abstract as an example:

A significant portion of contigs was allocated to the genome sequence of the archaeal methanogen Methanoculleus marisnigri JR1. Mapping of single reads to the M. marisnigri JR1 genome revealed that approximately 64% of the reference genome including methanogenesis gene regions are deeply covered.
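A figure like the "approximately 64% of the reference genome" in the abstract quoted above is a breadth of coverage: the fraction of reference positions covered by at least one mapped read. The following Python sketch shows one way such a number could be computed from alignment intervals. It is not the method used in the cited study; the function name, the 0-based half-open interval convention and the toy intervals are assumptions made for this example.

```python
def breadth_of_coverage(intervals, reference_length):
    """Fraction of the reference covered by at least one interval.

    intervals: iterable of (start, end) pairs, 0-based and half-open (assumption).
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # A gap: close the previous merged block and start a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / reference_length


# Toy alignments on a hypothetical 1000 bp reference.
toy_intervals = [(0, 100), (50, 150), (300, 400)]
fraction = breadth_of_coverage(toy_intervals, 1000)
print(f"{fraction:.0%} of the reference is covered")
```

Merging overlapping intervals first avoids counting positions twice when many reads pile up on the same region, which is the typical situation in deeply covered parts of a genome.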

written 7.4 years ago by Michael Dondrup (44k)