7.2 years ago
andrewdavis3
Does anyone know why, when I do a de novo assembly of a bacterial isolate and then map the same reads back to the contigs, there are sometimes one or two nucleotides (out of 5 million) where the mapping completely disagrees with the contig? For example, the contig has a G at position 25, but the pileup file from the mapping shows that all the reads have a C at position 25. I assume it's an issue with repeat regions, but I don't have any evidence that that's the reason. I have seen it with Velvet and SPAdes as the de novo assemblers and with Bowtie2 and BWA as the mappers.
I have not noticed this myself, but I suggest you trust the mapping results where mapping and assembly differ. K-mer-based assembly generally does not use information from the full read length, whereas mapping does. Mapping also tries to optimize the alignment of both reads in a pair, while k-mer assembly tends to consider only the length-k substrings of a single read.
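To find every position where the reads disagree with the contig, you could scan the pileup with something like the rough sketch below. It assumes the pileup was produced with the contigs supplied as the reference (e.g. `samtools mpileup -f contigs.fasta reads.bam`), so that `.` and `,` mean "matches the contig base". The function names and the `min_depth` / `min_fraction` thresholds are my own illustrative choices, not part of any tool.

```python
import re
from collections import Counter


def count_bases(ref_base, bases):
    """Count base calls in the read-bases column of one mpileup line."""
    counts = Counter()
    ref_base = ref_base.upper()
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            # Read start marker; the next character is a mapping quality.
            i += 2
        elif ch in "+-":
            # Indel: sign, length digits, then that many inserted/deleted bases.
            length = re.match(r"\d+", bases[i + 1:]).group()
            i += 1 + len(length) + int(length)
        else:
            if ch in ".,":
                counts[ref_base] += 1        # match to the contig base
            elif ch.upper() in "ACGTN":
                counts[ch.upper()] += 1      # mismatch on either strand
            # '$', '*' and other symbols are not base calls; skip them.
            i += 1
    return counts


def discordant_sites(pileup_lines, min_depth=10, min_fraction=0.9):
    """Yield sites where most reads support a base other than the contig's.

    min_depth and min_fraction are arbitrary illustrative thresholds.
    """
    for line in pileup_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        contig, pos, ref, bases = fields[0], int(fields[1]), fields[2], fields[4]
        counts = count_bases(ref, bases)
        total = sum(counts.values())
        if total < min_depth:
            continue
        base, n = counts.most_common(1)[0]
        if base != ref.upper() and n / total >= min_fraction:
            yield contig, pos, ref.upper(), base, n, total
```

Usage would be along the lines of `for site in discordant_sites(open("reads.pileup")): print(site)`, where `reads.pileup` is a placeholder file name. Looking at what surrounds the reported positions (repeats, contig ends, low-complexity sequence) should help test the repeat-region hypothesis.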