Question: Is it correct to use contigs instead of short-reads in the whole genome mapping/alignment and variant calling?
vassialk (Belarus) wrote 4.3 years ago:

Is it correct to use contigs instead of short reads for whole-genome mapping/alignment and variant calling? What differences in the results should I expect?

I am using NextGene, DNAStar, Ugene and BBMap with SAM and VCF tools for tuberculosis WGS samples.

DG wrote 4.3 years ago:
  1. It depends on what you are trying to do
  2. It REALLY depends on what you are trying to do

Given the question, I assume you are doing de novo assembly to generate contigs from short reads and then want to map those contigs against some sort of reference sequence. You can certainly do that, but whether it is appropriate depends on the question you are asking, on the state of the reference sequences for your organism, and on the expected variability. With bacteria in particular, if inter-strain variability is high, most people treat strains as independent and do de novo assembly of each genome rather than reference-based mapping. It is always a good idea to read plenty of papers in your area to see what people in your sub-field typically do. You might choose to do something completely different and unique, but you need to know what is typically done in order to justify your choices, whether you follow a fairly standard workflow or do something very different.


Note that most assemblers will pick one base or the other at positions where your sample is heterozygous or non-clonal, so you will miss a number of those variants.

Reply written 4.3 years ago by swbarnes2
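
To illustrate the point about mixed samples, here is a minimal Python sketch. The pileup string is made up, and the majority vote is only an assumed stand-in for an assembler's consensus step, not how any particular assembler works. It shows a minor allele that is visible in the reads disappearing from the consensus.

```python
from collections import Counter

def consensus_base(bases):
    """Majority vote at one site (assumed stand-in for an assembler's consensus)."""
    return Counter(bases).most_common(1)[0][0]

def allele_fractions(bases):
    """Fraction of reads supporting each base at one site."""
    counts = Counter(bases)
    total = len(bases)
    return {base: n / total for base, n in counts.items()}

# Hypothetical pileup at one position in a non-clonal sample:
# 14 reads carry G and 6 reads carry A.
pileup = "G" * 14 + "A" * 6

print(consensus_base(pileup))    # the contig would carry only this base
print(allele_fractions(pileup))  # the reads still show both alleles
```

With reads, the A allele is supported by 6 of 20 observations and can be reported; with the contig, only G is left to align against the reference.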

To add to this, you also lose some information when mapping contigs instead of reads - specifically, the depth and quality values of reads, which can be helpful in determining the probability of correctness of variant calls. However, contigs can give you a better ability to find larger-scale structural differences such as long insertions.

Reply written 4.3 years ago by Brian Bushnell
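
A rough sketch of why depth and base qualities matter for confidence in a call. It uses the standard Phred definition (error probability = 10^(-Q/10)). The quality values are hypothetical, and multiplying error probabilities assumes independent errors, which is a deliberate simplification and not what a real variant caller does.

```python
import math

def phred_to_error_prob(q):
    """Standard Phred scale: Q = -10 * log10(P_error)."""
    return 10 ** (-q / 10)

def prob_all_supporting_bases_wrong(quals):
    """Naive chance that every variant-supporting base is a sequencing error.

    Assumes errors are independent between reads (a simplification).
    """
    return math.prod(phred_to_error_prob(q) for q in quals)

# Hypothetical base qualities of reads supporting a variant at one site.
one_read = [30]
four_reads = [30, 35, 20, 38]

print(prob_all_supporting_bases_wrong(one_read))
print(prob_all_supporting_bases_wrong(four_reads))
```

Under these assumptions, each additional supporting read shrinks the chance that the variant is an artifact. A contig aligned to the reference is a single observation with no per-base quality, so this kind of evidence is not available.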

Thank you, your words are helpful. As far as I know, papers report using command-line tools and short reads.

Reply written 4.3 years ago by vassialk

Yes, basically one trades one type of error (SNP-calling errors) for another (assembly errors). Sometimes that works well; other times it does not.

Reply written 4.3 years ago by Istvan Albert